S. Kusuoka 

T. Maruyama (Eds.) 



Advances in 

MATHEMATICAL 

ECONOMICS 

Volume 5 




Springer 






Managing Editors 

Shigeo Kusuoka
University of Tokyo
Tokyo, JAPAN

Toru Maruyama
Keio University
Tokyo, JAPAN



Editors

Robert Anderson
University of California, Berkeley
Berkeley, U.S.A.

Charles Castaing
Université Montpellier II
Montpellier, FRANCE

Frank H. Clarke
Université de Lyon I
Villeurbanne, FRANCE

Gerard Debreu
University of California, Berkeley
Berkeley, U.S.A.

Egbert Dierker
University of Vienna
Vienna, AUSTRIA

Darrell Duffie
Stanford University
Stanford, U.S.A.

Lawrence C. Evans
University of California, Berkeley
Berkeley, U.S.A.

Takao Fujimoto
Kagawa University
Kagawa, JAPAN

Jean-Michel Grandmont
CREST-CNRS
Malakoff, FRANCE

Norimichi Hirano
Yokohama National University
Yokohama, JAPAN

Leonid Hurwicz
University of Minnesota
Minneapolis, U.S.A.

Tatsuro Ichiishi
Ohio State University
Ohio, U.S.A.

Alexander Ioffe
Israel Institute of Technology
Haifa, ISRAEL

Seiichi Iwamoto
Kyushu University
Fukuoka, JAPAN

Kazuya Kamiya
University of Tokyo
Tokyo, JAPAN

Kunio Kawamata
Keio University
Tokyo, JAPAN

Norio Kikuchi
Keio University
Yokohama, JAPAN



Hiroshi Matano 

University of Tokyo 
Tokyo, JAPAN 

Kazuo Nishimura 

Kyoto University 
Kyoto, JAPAN 

Marcel K. Richter 

University of Minnesota 
Minneapolis, U.S.A. 

Yoichiro Takahashi 

Kyoto University 
Kyoto, JAPAN 

Michel Valadier 

Université Montpellier II
Montpellier, FRANCE 

Akira Yamazaki 

Hitotsubashi University 
Tokyo, JAPAN 

Makoto Yano 

Keio University
Tokyo, JAPAN 



Aims and Scope. The project is to publish Advances in Mathemat- 
ical Economics once a year under the auspices of the Research Cen- 
ter of Mathematical Economics. It is designed to bring together those 
mathematicians who are seriously interested in obtaining new challeng- 
ing stimuli from economic theories and those economists who are seeking 
effective mathematical tools for their research. 

The scope of Advances in Mathematical Economics includes, but is 
not limited to, the following fields: 

— Economic theories in various fields based on rigorous mathematical
reasoning.

— Mathematical methods (e.g., analysis, algebra, geometry, probability) 
motivated by economic theories. 

— Mathematical results of potential relevance to economic theory. 

— Historical study of mathematical economics. 

Authors are asked to develop their original results as fully as pos- 
sible and also to give a clear-cut expository overview of the problem 
under discussion. Consequently, we will also invite articles which might 
be considered too long for publication in journals. 



Springer 

Tokyo 
Berlin 
Heidelberg 
New York 
Hong Kong 
London 
Milan 
Paris








Shigeo Kusuoka 
Professor 

Graduate School of Mathematical Sciences 
University of Tokyo 
3-8-1 Komaba, Meguro-ku 
Tokyo, 153-0041 Japan 



Toru Maruyama 
Professor 

Department of Economics 
Keio University 
2-15-45 Mita, Minato-ku 
Tokyo, 108-8345 Japan 



ISBN 4-431-00003-8 Springer-Verlag Tokyo Berlin Heidelberg New York
Printed on acid-free paper

Springer-Verlag is a company in the BertelsmannSpringer publishing group.
© Springer-Verlag Tokyo 2003
Printed in Japan 

This work is subject to copyright. All rights are reserved, whether the whole or 
part of the material is concerned, specifically the rights of translation, reprint- 
ing, reuse of illustrations, recitation, broadcasting, reproduction on microfilms 
or in other ways, and storage in data banks. The use of registered names, 
trademarks, etc. in this publication does not imply, even in the absence of a 
specific statement, that such names are exempt from the relevant protective 
laws and regulations and therefore free for general use. 



Camera-ready copy prepared from the authors' LaTeX files.
Printed and bound by Hirakawa Kogyosha, Japan. 

SPIN: 10898895 




Table of Contents 



Research Articles 
G. Carlier 

Duality and existence for a class of mass 
transportation problems and economic applications 1 



C. Castaing, A. G. Ibrahim 

Functional evolution equations governed by 
m-accretive operators 23 

T. Fujimoto, J. A. Silva, A. Villar 

Nonlinear generalizations of theorems on
inverse-positive matrices 55

L. Hurwicz, M. K. Richter

Implicit functions and diffeomorphisms without C^1 65

L. Hurwicz, M. K. Richter

Optimization and Lagrange multipliers: non-C^1
constraints and "minimal" constraint qualifications 97

S. Kusuoka 

Monte Carlo method for pricing of Bermuda type 
derivatives 153 



Historical Perspective 
I. Mutoh 

Mathematical economics in Vienna between the 
Wars 167 



Subject Index 197




Adv. Math. Econ. 5, 1-21 (2003) 



© Springer-Verlag 2003



Duality and existence for a class of mass 
transportation problems and economic 
applications 

Guillaume Carlier 

Université Bordeaux 1, MAB, UMR CNRS 5466 and Université Bordeaux IV,
GRAPE, UMR CNRS 5113, Avenue Léon Duguit, 33608, Pessac, FRANCE
(e-mail: Guillaume.Carlier@math.u-bordeaux.fr)

Received: April 15, 2002 
Revised: May 20, 2002 

JEL classification: C61, C82 

Mathematics Subject Classification (2000): 90C08, 90C46, 91B40 

Abstract We establish duality, existence and uniqueness results for a class of mass
transportation problems. We extend a technique of W. Gangbo [9] using the Euler
Equation of the dual problem. This is done by introducing the h-Fenchel Transform and
using its basic properties. The cost functions we consider satisfy a generalization of the
so-called Spence-Mirrlees condition, which is well known to economists in dimension
1. We therefore end this article with a somewhat unexpected application to the economic
theory of incentives.

Key words: mass transportation, duality, general Fenchel transform, economic theory 
of incentives, Spence-Mirrlees condition 



1. Introduction and main statement 

1.1 Assumptions and notations 

Let us first recall that, given a probability space $(\Omega_1, \mathcal{A}_1, \mu)$, a measurable
space $(\Omega_2, \mathcal{A}_2)$ and a measurable map $f : \Omega_1 \to \Omega_2$, the push-forward of $\mu$
through $f$, denoted $f\sharp\mu$, is the probability measure on $(\Omega_2, \mathcal{A}_2)$ defined by:

$$f\sharp\mu(B) := \mu(f^{-1}(B)) \quad \text{for every } B \in \mathcal{A}_2.$$

In all the following, $\Omega$ is some bounded connected open subset of $\mathbb{R}^n$ and
$\mu$ is some probability measure in $\Omega$ which is absolutely continuous with respect
to the n-dimensional Lebesgue measure, with a positive Radon-Nikodym
derivative with respect to the n-dimensional Lebesgue measure, and such that
$\mu(\partial\Omega) = 0$.
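As a small illustration of the push-forward just recalled, the following sketch computes $f\sharp\mu$ for a measure with finitely many atoms. The function name `push_forward` and the sample data are illustrative assumptions, not part of the article, and the discrete setting is only a toy analogue of the absolutely continuous measure considered here.

```python
from collections import defaultdict
from fractions import Fraction

def push_forward(mu, f):
    """Return the push-forward of the discrete measure `mu` through `f`.

    `mu` maps points of the source space to their masses; the result
    assigns to each image point y the mass mu(f^{-1}({y})).
    """
    image = defaultdict(Fraction)
    for x, mass in mu.items():
        image[f(x)] += mass
    return dict(image)

# Hypothetical example: uniform measure on {0, 1, 2, 3}, pushed through x -> x mod 2.
mu = {x: Fraction(1, 4) for x in range(4)}
nu = push_forward(mu, lambda x: x % 2)
```

The total mass is preserved, so `nu` is again a probability measure, which is the property used when requiring $s\sharp\mu = \nu$ for admissible maps.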

We are also given a compact Polish space $Y$, a Radon probability measure
$\nu$ on $Y$ and a function $h : \Omega \times Y \to \mathbb{R}$ which satisfies:

$$h \in C^0(\overline{\Omega} \times Y, \mathbb{R}), \tag{1}$$

for every $\omega \subset\subset \Omega$, there exists $c(\omega) > 0$ such that for all $(x_1, x_2) \in \omega^2$:

$$\sup_{y \in Y} |h(x_1, y) - h(x_2, y)| \le c(\omega)\,\|x_1 - x_2\|, \tag{2}$$

for all $y \in Y$, $h(\cdot, y)$ is differentiable in $\Omega$ and for all $(y_1, y_2, x) \in Y^2 \times \Omega$:

$$\frac{\partial h}{\partial x}(x, y_1) = \frac{\partial h}{\partial x}(x, y_2) \;\Rightarrow\; y_1 = y_2. \tag{3}$$

Assumption (3) plays an important role in the proofs and we shall see that
it may be interpreted as a generalization of the well-known one of Spence and
Mirrlees; this assumption was first introduced by Levin in [13].

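Our aim is to study the following Monge's mass transportation problem:

$$(\mathcal{M}) \qquad \sup_{s \in \Delta(\mu, \nu)} J(s) := \int_\Omega h(x, s(x))\, d\mu(x)$$

with:

$$\Delta(\mu, \nu) := \{s \text{ is a Borel map } \Omega \to Y \text{ s.t. } s\sharp\mu = \nu\}.$$

The associated Monge-Kantorovich problem is the linear (relaxation of
$(\mathcal{M})$) program:

$$(\mathcal{MK}) \qquad \sup_{\gamma \in \Gamma(\mu, \nu)} K(\gamma) := \int_{\Omega \times Y} h(x, y)\, d\gamma(x, y)$$

with:

$$\Gamma(\mu, \nu) := \{\gamma \text{ is a Borel probability measure on } \Omega \times Y \text{ s.t. } \pi_1\sharp\gamma = \mu,\ \pi_2\sharp\gamma = \nu\}$$

where $\pi_1(x, y) = x$, $\pi_2(x, y) = y$ for all $(x, y) \in \Omega \times Y$.

Finally, we define the (dual of $(\mathcal{M})$) problem:

$$(\mathcal{D}) \qquad \inf_{(\psi, \phi) \in E_h} L(\psi, \phi) := \int_\Omega \psi\, d\mu + \int_Y \phi\, d\nu$$

with:

$$E_h := \{(\psi, \phi) \text{ real-valued measurable s.t. } \psi(x) + \phi(y) \ge h(x, y),\ \forall (x, y) \in \Omega \times Y\}.$$

1.2 Main result

If $\psi$ is a given real-valued function defined on $\Omega$, we define the h-Fenchel
Transform of $\psi$, $\psi^h$, by:

$$\psi^h(y) := \sup_{x \in \Omega} h(x, y) - \psi(x), \quad \text{for all } y \in Y.$$

In a similar way, if $\phi$ is a given real-valued function defined on $Y$, we define
the h-Fenchel Transform of $\phi$, $\phi^h$, by:

$$\phi^h(x) := \sup_{y \in Y} h(x, y) - \phi(y), \quad \text{for all } x \in \Omega.$$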

Our main result can then be stated as follows: 

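Theorem 1 Under assumptions (1), (2), (3) the following assertions hold:

1) problems $(\mathcal{M})$, $(\mathcal{MK})$ and $(\mathcal{D})$ admit at least one solution,

2) $(\mathcal{D})$ is dual to $(\mathcal{M})$ and $(\mathcal{MK})$ in the sense:

$$\inf(\mathcal{D}) = \sup(\mathcal{M}) = \sup(\mathcal{MK}),$$

3) the minimum in $(\mathcal{D})$ is attained by a pair $(\overline{\psi}, \overline{\phi})$ such that:

$$\overline{\psi} = \overline{\phi}^h, \qquad \overline{\phi} = \overline{\psi}^h;$$

there exists moreover some Borel map $\overline{s}$ from $\Omega$ to $Y$ which satisfies:

$$\overline{\psi}(x) + \overline{\phi}(\overline{s}(x)) = h(x, \overline{s}(x)), \quad \text{for all } x \in \Omega;$$

$\overline{s} \in \Delta(\mu, \nu)$ and is a solution of $(\mathcal{M})$, and $(id, \overline{s})\sharp\mu$ is a solution of $(\mathcal{MK})$,

4) uniqueness also holds: if $s$ is a solution of $(\mathcal{M})$ then $s = \overline{s}$ $\mu$-a.e.,
$(id, \overline{s})\sharp\mu$ is the unique solution of $(\mathcal{MK})$, and if $(\psi, \phi)$ is a solution of $(\mathcal{D})$
then $\psi - \overline{\psi}$ (respectively $\phi - \overline{\phi}$) is equal to some constant $\mu$-a.e. (respectively
$\nu$-a.e.).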

In Section 2, technical lemmas are established and basic properties of the
h-Fenchel transform are proved. In Section 3, the main result is proved. Finally,
in Section 4, we address a question arising in the economic theory of incentives
and show how assumption (3) can be interpreted as a natural generalization of
the Spence-Mirrlees condition. In this framework, our main result enables us to
prove a general re-allocation principle.

The problem of optimal measure preserving maps $(\mathcal{M})$ has received a lot
of attention since related questions naturally arise in fluid mechanics [2],
differential geometry (see [16] for relation with a classical result of Aleksandrov
[1]), shape optimization [4], functional analysis [11], [12], probability [19] and
economics. In the case $Y \subset \mathbb{R}^n$ and $h(x, y) = x \cdot y$, the problem was solved by
Brenier [3], who proved the important Polar Factorization Theorem and existence
and uniqueness of an optimal map which is the gradient of some convex
potential. This result was then extended by McCann and Gangbo [10] for costs
of the form $c(x - y)$ with $c$ strictly convex. The result stated in Theorem 1 is
very much in that spirit since it expresses existence and uniqueness of an optimal
allocation map which is a measurable selection of the h-subdifferential of
some h-convex potential. Similar characterization results were obtained by V.
Levin [13] using a different approach based on cyclical monotonicity and the
relaxed problem $(\mathcal{MK})$.



2. Technical preliminaries and h-Fenchel transform

In what follows $\psi$ will always denote some function $\Omega \to \mathbb{R} \cup \{+\infty\}$ and $\phi$
some function $Y \to \mathbb{R} \cup \{+\infty\}$.

Definition 1 1) $\psi$ is h-convex if and only if there exists a nonempty subset $A$
of $Y \times \mathbb{R}$ such that:

$$\psi(x) = \sup_{(y, t) \in A} h(x, y) + t, \quad \text{for all } x \in \Omega.$$

2) $\phi$ is h-convex if and only if there exists a nonempty subset $B$ of $\Omega \times \mathbb{R}$ such
that:

$$\phi(y) = \sup_{(x, t) \in B} h(x, y) + t, \quad \text{for all } y \in Y.$$

Remark. If $\psi$ is h-convex then either $\psi$ is identically $+\infty$ or it is bounded.
Note also that finite h-convex potentials are l.s.c., hence measurable.

Definition 2 1) The h-Fenchel Transform of $\psi$, $\psi^h$, is the h-convex function
defined by:

$$\psi^h(y) := \sup_{x \in \Omega} h(x, y) - \psi(x), \quad \text{for all } y \in Y.$$

2) The h-Fenchel Transform of $\phi$, $\phi^h$, is the h-convex function defined by:

$$\phi^h(x) := \sup_{y \in Y} h(x, y) - \phi(y), \quad \text{for all } x \in \Omega.$$

Obviously, Young's inequalities hold:

$$\psi(x) + \psi^h(y) \ge h(x, y), \quad \text{for all } (x, y) \in \Omega \times Y \tag{4}$$

and:

$$\phi^h(x) + \phi(y) \ge h(x, y), \quad \text{for all } (x, y) \in \Omega \times Y. \tag{5}$$

Proposition 1

$$(\psi^h)^h(x) = \sup\{f(x) : f \le \psi,\ f \text{ is h-convex}\}, \quad \text{for all } x \in \Omega,$$

$$(\phi^h)^h(y) = \sup\{g(y) : g \le \phi,\ g \text{ is h-convex}\}, \quad \text{for all } y \in Y.$$

It follows that $\psi$ (respectively $\phi$) is h-convex if and only
if $\psi = (\psi^h)^h$ (respectively $\phi = (\phi^h)^h$).

Proof.

First, $(\psi^h)^h$ is h-convex and Young's inequality yields $(\psi^h)^h \le \psi$ so that,
if we define:

$$V(x) := \sup\{f(x) : f \le \psi,\ f \text{ is h-convex}\}, \quad \text{for all } x \in \Omega, \tag{6}$$

then:

$$(\psi^h)^h \le V \le \psi. \tag{7}$$

Since $V$ is h-convex, there exists a nonempty subset $A$ of $Y \times \mathbb{R}$ such that:

$$V(x) = \sup_{(y, t) \in A} h(x, y) + t, \quad \text{for all } x \in \Omega.$$

Let $(y_0, t_0) \in A$ and $\xi := h(\cdot, y_0) + t_0$; we have:

$$\psi \ge \xi \;\Rightarrow\; (\psi^h)^h \ge (\xi^h)^h. \tag{8}$$

Of course $\xi \le \psi$ and, since $\xi^h(y_0) = -t_0$, then $(\xi^h)^h(x) \ge h(x, y_0) + t_0 =
\xi(x)$ for all $x \in \Omega$, so with (8) we get $(\psi^h)^h \ge \xi$. Since $(y_0, t_0)$
is arbitrary in $A$, taking the supremum yields $(\psi^h)^h \ge V$, so that $V = (\psi^h)^h$
using (7). The characterization of $(\phi^h)^h$ is proved in the same way.

□
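The definitions and Proposition 1 can be explored numerically on finite grids. The sketch below is only a discrete analogue under stated assumptions: the grids `xs`, `ys`, the choice $h(x, y) = xy$ and the sample function `psi` are hypothetical, and suprema are replaced by maxima over the grids. It checks Young's inequality (4), the inequality $(\psi^h)^h \le \psi$, and the fact that applying the transform three times gives the same result as applying it once.

```python
def h_transform_of_psi(h, psi, xs, ys):
    """psi^h(y) = max over x of h(x, y) - psi(x), on finite grids."""
    return {y: max(h(x, y) - psi[x] for x in xs) for y in ys}

def h_transform_of_phi(h, phi, xs, ys):
    """phi^h(x) = max over y of h(x, y) - phi(y), on finite grids."""
    return {x: max(h(x, y) - phi[y] for y in ys) for x in xs}

# Hypothetical data: integer grids and h(x, y) = x * y.
xs = range(-3, 4)
ys = range(-2, 3)
h = lambda x, y: x * y
psi = {x: abs(x) ** 3 - 2 * x for x in xs}      # an arbitrary finite function

psi_h = h_transform_of_psi(h, psi, xs, ys)      # psi^h on the y-grid
psi_hh = h_transform_of_phi(h, psi_h, xs, ys)   # (psi^h)^h on the x-grid
psi_hhh = h_transform_of_psi(h, psi_hh, xs, ys) # ((psi^h)^h)^h on the y-grid

young_holds = all(psi[x] + psi_h[y] >= h(x, y) for x in xs for y in ys)
biconjugate_below = all(psi_hh[x] <= psi[x] for x in xs)
triple_equals_single = psi_hhh == psi_h
```

Integer data are used so that all comparisons are exact; with floating-point data a tolerance would be needed.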



Definition 3 1) Define, for all $x \in \Omega$:

$$\partial^h\psi(x) := \{y \in Y : \psi(x') - \psi(x) \ge h(x', y) - h(x, y), \text{ for all } x' \in \Omega\}.$$

$\partial^h\psi(x)$ is called the h-subdifferential of $\psi$ at $x$, and $\psi$ is h-subdifferentiable
at $x$ if and only if $\partial^h\psi(x) \ne \emptyset$.

2) Define, for all $y \in Y$:

$$\partial^h\phi(y) := \{x \in \Omega : \phi(y') - \phi(y) \ge h(x, y') - h(x, y), \text{ for all } y' \in Y\}.$$

Note that $\partial^h\psi(x)$ and $\partial^h\phi(y)$ can also be defined by:

$$\partial^h\psi(x) = \{y \in Y : \psi(x) + \psi^h(y) = h(x, y)\}$$

and:

$$\partial^h\phi(y) = \{x \in \Omega : \phi^h(x) + \phi(y) = h(x, y)\}.$$

In particular, if $\psi$ is h-convex and $x \in \Omega$, then $y \in \partial^h\psi(x)$ if and only if
$x \in \partial^h\psi^h(y)$.

Proposition 2 Let $\psi$ be h-convex and finite; the following assertions hold true:

1) For all $x \in \Omega$, $\partial^h\psi(x)$ is nonempty and compact, and the restriction of
the set-valued map $\partial^h\psi$ to every closed subset of $\Omega$ has a closed graph,

2) $\psi \in W^{1,\infty}_{loc}(\Omega)$ and: $\operatorname{ess\,sup}_\omega |\nabla\psi| \le c(\omega)$ for all $\omega \subset\subset \Omega$, where $c(\omega)$ is
given by (2),

3) if $\psi$ is differentiable at $x \in \Omega$ and $y \in \partial^h\psi(x)$ then

$$\nabla\psi(x) = \frac{\partial h}{\partial x}(x, y),$$

4) there exists some Borel map $s_\psi$ such that for almost every $x \in \Omega$,
$\partial^h\psi(x) = \{s_\psi(x)\}$ and $s_\psi(x) \in \partial^h\psi(x)$ for all $x \in \Omega$.

Proof.

Let $A$ be some nonempty subset of $Y \times \mathbb{R}$ such that:

$$\psi(x) = \sup_{(y, t) \in A} h(x, y) + t, \quad \text{for all } x \in \Omega.$$

1) Fix $x \in \Omega$ and let $(y_n, t_n)$ be some sequence of $A$ such that $h(x, y_n) + t_n \to
\psi(x)$ as $n \to +\infty$. Up to a subsequence we may assume that $y_n$ converges
to some $y \in Y$ so that $t_n \to t := \psi(x) - h(x, y)$. Let us show now that
$y \in \partial^h\psi(x)$. Let $x' \in \Omega$; for all $n$, $\psi(x') \ge h(x', y_n) + t_n$. Passing to the limit
we get $\psi(x') \ge \psi(x) + h(x', y) - h(x, y)$, i.e. $y \in \partial^h\psi(x)$; $\partial^h\psi(x)$ is clearly
compact since $Y$ is and $h$ is continuous. The fact that the restriction of $\partial^h\psi$ to
every closed subset of $\Omega$ has a closed graph is straightforward.

2) Let $\omega \subset\subset \Omega$ and

$$c(\omega) := \sup_{(x_1, x_2, y) \in \omega^2 \times Y,\ x_1 \ne x_2} |h(x_1, y) - h(x_2, y)| \cdot \|x_1 - x_2\|^{-1} < +\infty.$$

Let $(x_1, x_2) \in \omega^2$; we have:

$$\psi(x_1) = \sup_{(y, t) \in A} h(x_1, y) + t = \sup_{(y, t) \in A} \left[h(x_2, y) + t + h(x_1, y) - h(x_2, y)\right] \le \psi(x_2) + c(\omega)\,\|x_1 - x_2\|;$$

finally, reversing the roles of $x_1$ and $x_2$ yields the desired result.

3) Let $x \in \Omega$ be a point of differentiability of $\psi$ and $y \in \partial^h\psi(x)$, let
$k \in \mathbb{R}^n$ and $t > 0$ be such that $[x - tk, x + tk] \subset \Omega$:

$$\psi(x + tk) - \psi(x) = t\,\nabla\psi(x) \cdot k + o(t) \ge h(x + tk, y) - h(x, y) = t\,\frac{\partial h}{\partial x}(x, y) \cdot k + o(t).$$

Dividing by $t > 0$ and letting $t \to 0^+$ in the previous inequality yields

$$\nabla\psi(x) \cdot k \ge \frac{\partial h}{\partial x}(x, y) \cdot k;$$

similarly, the converse inequality also holds taking negative $t$, and since $k$ is
arbitrary we get:

$$\nabla\psi(x) = \frac{\partial h}{\partial x}(x, y).$$

4) By 2) and Rademacher's Theorem, $\psi$ is differentiable a.e. in $\Omega$. On the
other hand, since $Y$ is compact and separable and using 1), $\partial^h\psi$ admits a measurable
selection, say $s_\psi$ (see [6] or the measurable selection Theorem of Brown
and Purves in [25]). If $\psi$ is differentiable at $x \in \Omega$ and $y \in \partial^h\psi(x)$ then by 3)
we get:

$$\nabla\psi(x) = \frac{\partial h}{\partial x}(x, y) = \frac{\partial h}{\partial x}(x, s_\psi(x))$$

so that with (3) $\partial^h\psi(x) = \{s_\psi(x)\}$.

□



Corollary 1 Let $\psi_1$ and $\psi_2$ be h-convex and finite; if for $\mu$-a.e. $x \in \Omega$

$$\partial^h\psi_1(x) \cap \partial^h\psi_2(x) \ne \emptyset$$

then $\psi_1 - \psi_2$ is constant.

Proof.

Using Proposition 2, we get $\nabla(\psi_1 - \psi_2) = 0$ a.e. in $\Omega$, hence the desired
result, since $\Omega$ is connected. □

We end this section by a result which will play a crucial role in the proof
of the main result. The next Proposition is actually a straightforward generalization
of a result of Gangbo [9] which was an important tool in [9] to prove
Brenier's Theorem.

Proposition 3 Let $\phi$ be h-convex and finite, let $f \in C^0(Y, \mathbb{R})$, define:

$$\psi_0 := \phi^h$$

and for all $r \in (-1, 1)$

$$\psi_r := (\phi + rf)^h;$$

then also define $A := \{x \in \Omega : \psi_0 \text{ is differentiable at } x\}$ and $s := s_{\psi_0}$ as in
Proposition 2, 4); then, for all $x \in A$:

$$\lim_{r \to 0} \frac{1}{r}\left[\psi_r(x) - \psi_0(x)\right] = -f(s(x)). \tag{9}$$

Since $\mu(\Omega \setminus A) = 0$, (9) is satisfied a.e. in $\Omega$.

Proof.

Let $x \in A$; first we have:

$$\psi_0(x) = h(x, s(x)) - \phi(s(x)). \tag{10}$$

And, for all $r \in (-1, 1)$:

$$\psi_r(x) = h(x, y_r) - \phi(y_r) - r f(y_r) \quad \text{for all } y_r \in \partial^h\psi_r(x). \tag{11}$$

Let $r_n$ be some sequence of $(-1, 1) \setminus \{0\}$ which converges to 0 and relabel
some sequence $y_{r_n} \in \partial^h\psi_{r_n}(x)$ as $y_n$.

Step 1.

Let us show first that $y_n \to s(x)$ as $n \to +\infty$.

Up to a subsequence we may assume that $y_n$ converges to $y \in Y$. First note
that:

$$\|\psi_{r_n} - \psi_0\|_\infty \le |r_n|\,\|f\|_\infty \to 0 \tag{12}$$

and:

$$\psi_{r_n}(x) = h(x, y_n) - \phi(y_n) - r_n f(y_n). \tag{13}$$

Since $\phi$ is l.s.c., we get:

$$\liminf_n \phi(y_n) \ge \phi(y)$$

so that, passing to the limit in (13), $\psi_0(x) \le h(x, y) - \phi(y) = h(x, y) - \psi_0^h(y)$
and then $y \in \partial^h\psi_0(x) = \{s(x)\}$; $s(x)$ is therefore the only cluster point of $y_n$
so that the whole sequence converges to $s(x)$.







Step 2.

First, we have:

$$\frac{1}{r_n}\left[\psi_{r_n}(x) - \psi_0(x)\right] = \frac{1}{r_n}\left[(h(x, y_n) - \phi(y_n)) - (h(x, s(x)) - \phi(s(x)))\right] - f(y_n). \tag{14}$$

On the one hand:

$$h(x, y_n) - \phi(y_n) \le h(x, s(x)) - \phi(s(x)); \tag{15}$$

on the other hand:

$$h(x, y_n) - \phi(y_n) \ge h(x, s(x)) - \phi(s(x)) + r_n\left[f(y_n) - f(s(x))\right]. \tag{16}$$

Using (15), (16) and the fact that $y_n$ converges to $s(x)$ and passing to the limit
in (14) we obtain:

$$\lim_n \frac{1}{r_n}\left[\psi_{r_n}(x) - \psi_0(x)\right] = -f(s(x)). \tag{17}$$

Since (17) holds for any sequence $(r_n) \in ((-1, 1) \setminus \{0\})^{\mathbb{N}}$ that converges to 0
we finally get:

$$\lim_{r \to 0} \frac{1}{r}\left[\psi_r(x) - \psi_0(x)\right] = -f(s(x)). \tag{18}$$

□



3. Proof of the main statement

We are now ready to prove Theorem 1. First note that one obviously has:

$$\sup(\mathcal{MK}) \ge \sup(\mathcal{M}). \tag{19}$$

Let $(\psi, \phi) \in E_h$ and $\gamma \in \Gamma(\mu, \nu)$; one has:

$$L(\psi, \phi) = \int_{\Omega \times Y} (\psi(x) + \phi(y))\, d\gamma(x, y) \ge \int_{\Omega \times Y} h(x, y)\, d\gamma(x, y) = K(\gamma)$$

so that:

$$\inf(\mathcal{D}) \ge \sup(\mathcal{MK}). \tag{20}$$

Remark. If $(\psi, \phi) \in E_h$ and $s \in \Delta(\mu, \nu)$ (respectively $\gamma \in \Gamma(\mu, \nu)$) are such
that $J(s) = L(\psi, \phi)$ (respectively $K(\gamma) = L(\psi, \phi)$) then $(\psi, \phi)$ is a solution
of $(\mathcal{D})$ and $s$ is a solution of $(\mathcal{M})$ (respectively $\gamma$ is a solution of $(\mathcal{MK})$).

The first step of the proof is:

Lemma 1 There exists a solution $(\overline{\psi}, \overline{\phi})$ of $(\mathcal{D})$; moreover, if $(\psi, \phi)$ is a solution
of $(\mathcal{D})$ then $\psi$ is h-convex, $\phi$ is h-convex and those functions are conjugate
to each other: $\phi = \psi^h$ $\nu$-a.e. and $\psi = \phi^h$ $\mu$-a.e.

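Proof.

Note first that it is clear from (20) that the value of $(\mathcal{D})$ is finite.

Step 3.

We first prove that if $(\psi, \phi) \in E_h$ is a solution of $(\mathcal{D})$ then:

$$\nu(\{\phi > \psi^h\}) = \mu(\{\psi > \phi^h\}) = 0. \tag{21}$$

If $(\psi, \phi) \in E_h$ then obviously $\psi \ge \phi^h$ and $\phi \ge \psi^h$. Let $\tilde{\phi} := \psi^h$ and $\tilde{\psi} :=
\tilde{\phi}^h = (\psi^h)^h$; by Young's inequality $(\tilde{\psi}, \tilde{\phi}) \in E_h$ and $\tilde{\psi} \le \psi$ and $\tilde{\phi} \le \phi$ so that
$L(\psi, \phi) \ge L(\tilde{\psi}, \tilde{\phi})$. Hence if $(\psi, \phi) \in E_h$ is a solution of $(\mathcal{D})$ then:

$$\nu(\{\phi > \psi^h\}) = 0$$

and

$$\mu(\{\psi > \tilde{\phi}^h\}) = 0.$$

The symmetric argument, starting from $\phi^h$ in place of $\psi^h$, gives $\mu(\{\psi > \phi^h\}) = 0$, and (21) is proved.

Step 4.

We now prove existence. Let $(\psi_n, \phi_n) \in E_h$ be some minimizing sequence
of $(\mathcal{D})$; noting that $L(\psi_n + a, \phi_n - a) = L(\psi_n, \phi_n)$ and using (21), we may
assume with no loss of generality that $\psi_n = \phi_n^h$, $\phi_n = \psi_n^h$ and:

$$\inf_Y \phi_n = 0; \tag{22}$$

also note that the infimum in (22) is attained, since $\phi_n$ is l.s.c., say at some point
$z_n$. Since $\phi_n \ge 0$ we get first:

$$\psi_n \le \max_{\overline{\Omega} \times Y} h$$

and since $\phi_n(z_n) = 0$:

$$\psi_n \ge h(\cdot, z_n) \ge \min_{\overline{\Omega} \times Y} h$$

so that $\psi_n(x)$ is bounded uniformly in $n$ and $x \in \Omega$. On the other hand, using
the fact that $\psi_n$ is h-convex and Proposition 2, assertion 2), we get that $\psi_n$
is locally Lipschitz uniformly in $n$. Using Ascoli's Theorem, we may assume,
up to a subsequence, that $\psi_n$ converges uniformly on compact subsets of $\Omega$ to
some bounded and locally Lipschitz function $\overline{\psi}$.

Step 5.

Let us prove that $\overline{\psi}$ is itself h-convex. Define, for all $x \in \Omega$:

$$\tilde{\psi}(x) := \sup_{x' \in \Omega,\ y' \in F(x')} \overline{\psi}(x') + h(x, y') - h(x', y')$$

with:

$$F(x') := \bigcap_{N \ge 1} \overline{\bigcup_{n \ge N} \partial^h\psi_n(x')};$$

note that $F(x') \ne \emptyset$, for $\partial^h\psi_n(x')$ is nonempty and compact for all $n$.

$\tilde{\psi}$ is clearly h-convex and $\tilde{\psi} \ge \overline{\psi}$; let us show the converse inequality: let
$x' \in \Omega$ and $y' = \lim_N y_{n_N}$ where $n_N \to +\infty$ and $y_{n_N} \in \partial^h\psi_{n_N}(x')$; passing
to the limit in:

$$\psi_{n_N}(x) \ge \psi_{n_N}(x') + h(x, y_{n_N}) - h(x', y_{n_N})$$

we get:

$$\overline{\psi}(x) \ge \overline{\psi}(x') + h(x, y') - h(x', y');$$

taking the supremum in the previous inequality finally proves $\tilde{\psi} = \overline{\psi}$, so that $\overline{\psi}$ is
h-convex.

Step 6.

Let $\overline{\phi} := \overline{\psi}^h$ (so that $(\overline{\psi}, \overline{\phi}) \in E_h$) and let us prove that $(\overline{\psi}, \overline{\phi})$ is a solution
of $(\mathcal{D})$. Lebesgue's dominated convergence Theorem yields first:

$$\int_\Omega \psi_n\, d\mu \to \int_\Omega \overline{\psi}\, d\mu. \tag{23}$$

Now since, for all $(x, y) \in \Omega \times Y$, $\phi_n(y) \ge h(x, y) - \psi_n(x)$ we get:

$$\liminf_n \phi_n \ge \overline{\phi}; \tag{24}$$

using (24) and Fatou's Lemma we get:

$$\liminf_n \int_Y \phi_n\, d\nu \ge \int_Y \overline{\phi}\, d\nu. \tag{25}$$

By (23) and (25) we deduce that $(\overline{\psi}, \overline{\phi})$ is a solution of $(\mathcal{D})$ with $\overline{\psi} = \overline{\phi}^h$ since
$\overline{\phi} = \overline{\psi}^h$ and $\overline{\psi}$ is h-convex.

□

The precise duality relations between $(\mathcal{D})$ and $(\mathcal{M})$, $(\mathcal{MK})$ are given by:

Lemma 2 Let $(\overline{\psi}, \overline{\phi})$ be as in the previous Lemma and $\overline{s}$ be any Borel selection
of $\partial^h\overline{\psi}$; the following assertions hold:

1) $\overline{s} \in \Delta(\mu, \nu)$ and it is a solution of $(\mathcal{M})$,

2) $\overline{\gamma} := (id, \overline{s})\sharp\mu \in \Gamma(\mu, \nu)$ and it is a solution of $(\mathcal{MK})$,

3) $(\mathcal{D})$ is dual to $(\mathcal{M})$ and $(\mathcal{MK})$ in the sense:

$$V := \inf(\mathcal{D}) = \sup(\mathcal{M}) = \sup(\mathcal{MK}).$$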

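Proof. Since $(\phi^h, \phi) \in E_h$ for all $\phi$, $\overline{\phi}$ minimizes $\phi \mapsto \tilde{L}(\phi) := L(\phi^h, \phi)$,
say for instance in $L^\infty(Y, \nu)$.

In particular, for all $f \in C^0(Y, \mathbb{R})$:

$$\lim_{r \to 0^+} \frac{1}{r}\left[\tilde{L}(\overline{\phi} + rf) - \tilde{L}(\overline{\phi})\right] \ge 0, \tag{26}$$

$$\frac{1}{r}\left[\tilde{L}(\overline{\phi} + rf) - \tilde{L}(\overline{\phi})\right] = \int_Y f\, d\nu + \int_\Omega \frac{1}{r}\left[(\overline{\phi} + rf)^h - \overline{\phi}^h\right] d\mu.$$

Proposition 3 yields first:

$$\lim_{r \to 0^+} \frac{1}{r}\left[(\overline{\phi} + rf)^h(x) - \overline{\phi}^h(x)\right] = -f(\overline{s}(x)), \quad \mu\text{-a.e.}; \tag{27}$$

on the other hand:

$$\left|\frac{1}{r}\left[(\overline{\phi} + rf)^h - \overline{\phi}^h\right]\right| \le \|f\|_\infty. \tag{28}$$

(26), (27), (28) and Lebesgue's Dominated Convergence Theorem then yield:

$$\int_Y f\, d\nu - \int_\Omega f(\overline{s}(x))\, d\mu(x) \ge 0 \tag{29}$$

and the converse inequality obviously holds changing $f$ into $-f$. To sum up,
we have proved:

$$\int_Y f\, d\nu = \int_\Omega f(\overline{s}(x))\, d\mu(x) \tag{30}$$

for all $f \in C^0(Y, \mathbb{R})$ so that $\nu = \overline{s}\sharp\mu$. In other words $\overline{s} \in \Delta(\mu, \nu)$ and $\overline{\gamma} :=
(id, \overline{s})\sharp\mu \in \Gamma(\mu, \nu)$. Now note that since $\overline{\psi}(x) + \overline{\phi}(\overline{s}(x)) = h(x, \overline{s}(x))$ and
using $\overline{s}\sharp\mu = \nu$ we have:

$$L(\overline{\psi}, \overline{\phi}) = \inf(\mathcal{D}) = \int_\Omega \left[\overline{\psi}(x) + \overline{\phi}(\overline{s}(x))\right] d\mu(x) = \int_\Omega h(x, \overline{s}(x))\, d\mu(x) = J(\overline{s}) = K(\overline{\gamma}).$$

Finally, using (20) we get:

$$J(\overline{s}) = \sup(\mathcal{M}) = K(\overline{\gamma}) = \sup(\mathcal{MK}) = \inf(\mathcal{D}) \tag{31}$$

which proves that $\overline{s}$ is a solution of $(\mathcal{M})$ and $\overline{\gamma}$ is a solution of $(\mathcal{MK})$, hence
the desired result.

□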
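The duality relation can be checked by hand in a finite toy model. The sketch below is an illustration under assumptions that are not in the article: $\mu$ and $\nu$ are uniform on $n$ points, $(\mathcal{M})$ is solved by brute force over permutations, and dual potentials are built from the optimal permutation with a Bellman-Ford pass (a standard device for assignment problems, swapped in here for the variational argument of the proof). The names `solve_monge`, `dual_potentials` and the data are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def solve_monge(H):
    """Brute-force the discrete Monge problem for uniform measures on n points.

    H[i][j] = h(x_i, y_j). Returns (value, sigma) where sigma maximizes the
    average of H[i][sigma[i]] over all permutations.
    """
    n = len(H)
    sigma = max(permutations(range(n)),
                key=lambda s: sum(H[i][s[i]] for i in range(n)))
    return Fraction(sum(H[i][sigma[i]] for i in range(n)), n), sigma

def dual_potentials(H, sigma):
    """Build (psi, phi) with psi = phi^h and sigma(i) attaining the max in psi(i).

    Bellman-Ford on edges j -> sigma(i) of weight H[i][sigma(i)] - H[i][j];
    there is no negative cycle when sigma is optimal.
    """
    n = len(H)
    phi = [0] * n
    for _ in range(n):
        for i in range(n):
            for j in range(n):
                cand = phi[j] + H[i][sigma[i]] - H[i][j]
                if cand < phi[sigma[i]]:
                    phi[sigma[i]] = cand
    psi = [max(H[i][j] - phi[j] for j in range(n)) for i in range(n)]
    return psi, phi

# Hypothetical data: h(x, y) = x * y on four types and four actions.
xs = [0, 1, 2, 3]
ys = [5, 1, 4, 2]
H = [[x * y for y in ys] for x in xs]

primal, sigma = solve_monge(H)
psi, phi = dual_potentials(H, sigma)
dual = Fraction(sum(psi) + sum(phi), len(H))   # L(psi, phi) for uniform measures
```

In this finite setting `primal` and `dual` coincide and `psi[i] + phi[sigma[i]]` equals `H[i][sigma[i]]` for every `i`, which is the discrete counterpart of the relation $\overline{\psi}(x) + \overline{\phi}(\overline{s}(x)) = h(x, \overline{s}(x))$.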

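The last thing to prove is uniqueness:

Lemma 3 Let $(\overline{\psi}, \overline{\phi})$, $\overline{s}$ and $\overline{\gamma}$ be as in the previous Lemma; the following
assertions hold:

1) if $(\psi, \phi)$ is a solution of $(\mathcal{D})$ then there exists a constant $a$ such that:

$$\psi - \overline{\psi} = a, \ \mu\text{-a.e.}, \qquad \phi - \overline{\phi} = -a, \ \nu\text{-a.e.},$$

2) if $s$ is a solution of $(\mathcal{M})$ then $s = \overline{s}$ $\mu$-a.e.,

3) $\overline{\gamma} := (id, \overline{s})\sharp\mu$ is the unique solution of $(\mathcal{MK})$.

Proof.

1) Assume that $(\psi, \phi)$ is a solution of $(\mathcal{D})$; then, using Lemma 1, we may
assume that

$$\psi = \phi^h \quad \text{and} \quad \phi = \psi^h;$$

let $s$ be some Borel selection of $\partial^h\psi$. We already know that $s$ is a solution of
$(\mathcal{M})$ by Lemma 2. Young's inequality yields:

$$\overline{\psi}(x) + \overline{\phi}(s(x)) \ge h(x, s(x)), \quad \text{for all } x \in \Omega; \tag{32}$$

using $J(s) = L(\overline{\psi}, \overline{\phi})$ and the fact that $s \in \Delta(\mu, \nu)$ we get:

$$L(\overline{\psi}, \overline{\phi}) = \int_\Omega \left[\overline{\psi}(x) + \overline{\phi}(s(x))\right] d\mu(x) = \int_\Omega h(x, s(x))\, d\mu(x). \tag{33}$$

(33) and (32) yield:

$$\overline{\psi}(x) + \overline{\phi}(s(x)) = h(x, s(x)), \quad \text{for } \mu\text{-almost every } x \in \Omega$$

or equivalently $s(x) \in \partial^h\overline{\psi}(x)$ a.e.

Finally, Corollary 1 implies that there exists some constant $a$ such that
$\psi - \overline{\psi} = a$ $\mu$-a.e. and $\phi - \overline{\phi} = -a$ $\nu$-a.e.

2) Similarly, if $s$ is a solution of $(\mathcal{M})$ then

$$\overline{\psi}(x) + \overline{\phi}(s(x)) = h(x, s(x)), \quad \text{for } \mu\text{-almost every } x \in \Omega$$

and then $s = \overline{s}$ $\mu$-a.e.

3) Let $\gamma$ be a solution of $(\mathcal{MK})$ so that:

$$K(\gamma) = \int_{\Omega \times Y} h\, d\gamma = L(\overline{\psi}, \overline{\phi}) = \int_{\Omega \times Y} \left[\overline{\psi}(x) + \overline{\phi}(y)\right] d\gamma(x, y); \tag{34}$$

since $(\overline{\psi}, \overline{\phi}) \in E_h$ we get:

$$h(x, y) = \overline{\psi}(x) + \overline{\phi}(y), \quad \gamma\text{-a.e.} \tag{35}$$

Let $G(\overline{s})$ be the graph of $\overline{s}$ and $G(\partial^h\overline{\psi})$ be that of $\partial^h\overline{\psi}$; (35) implies:

$$\gamma(G(\overline{s})) = \gamma(G(\partial^h\overline{\psi})) = 1. \tag{36}$$

Let $A$ be some Borel subset of $\Omega$ and $B$ be some Borel subset of $Y$; by (36),
we get:

$$\gamma(A \times B) = \gamma((A \times B) \cap G(\overline{s}));$$

using (36) once again, we then get:

$$\gamma(A \times B) = \gamma((A \cap \overline{s}^{-1}(B)) \times Y)$$

and since $\pi_1\sharp\gamma = \mu$:

$$\gamma(A \times B) = \mu(A \cap \overline{s}^{-1}(B)) = \overline{\gamma}(A \times B)$$

so that $\gamma = \overline{\gamma}$, which ends the proof.

□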

We end this section by a Polar-Factorization type consequence of the main
result:

Corollary 2 Up to $\mu$-a.e. equivalence there exists a unique Borel map $s$ such
that:

1) there exists some h-convex potential $\psi$ such that $s(x) \in \partial^h\psi(x)$ for all
$x \in \Omega$,

2) $s$ pushes $\mu$ forward to $\nu$.

Proof. $\overline{s}$ defined as previously satisfies the desired result; now, if $s$ satisfies 1)
and 2) then

$$J(s) = L(\psi, \psi^h) \ge \inf(\mathcal{D}) = \sup(\mathcal{M})$$

so that, using Lemma 3, $s = \overline{s}$ $\mu$-a.e.

□



4. Economic application and generalized Spence-Mirrlees 
condition 

We end this article by an application of our result to the theory of incentives.
More precisely, we are going to prove a re-allocation principle that generalizes
a well-known one in dimension 1.

Assume that agents' preferences are given by the quasi-linear utility function:

$$V(x, y, t) = h(x, y) + t,$$

where $x \in \Omega$ is the agent's type or parameter, $y \in Y$ is an action and $t \in \mathbb{R}$
is some monetary compensatory transfer. We make the same assumptions on
$\Omega$, $Y$, $h$ and $\mu$ as previously. Note that in this case, the probability measure $\mu$
captures the distribution of types among agents.

A key concept in that theory is that of incentive-compatible contracts:

Definition 4 1) A contract is a pair of functions $(s, t) : \Omega \to Y \times \mathbb{R}$.

2) The potential associated with a contract $(s, t)$ is the function $V_{s,t}$ defined
by:

$$V_{s,t}(x) := h(x, s(x)) + t(x) \quad \text{for all } x \in \Omega.$$

3) The contract $(s, t)$ is incentive-compatible if and only if:

$$h(x, s(x)) + t(x) \ge h(x, s(x')) + t(x'), \quad \text{for all } (x, x') \in \Omega^2. \tag{37}$$

4) A function $s : \Omega \to Y$ is implementable if and only if there exists $t :
\Omega \to \mathbb{R}$ such that $(s, t)$ is incentive-compatible.

Remark. The incentive-compatibility condition (37) means that it is optimal
for every agent to announce his true parameter.

4.1 The usual Spence-Mirrlees condition and the one dimensional case

In the special one-dimensional case, i.e. $\Omega = (a, b)$, $Y = [\alpha, \beta]$, and under the
assumption that $h$ is of class $C^2$ and satisfies the Spence-Mirrlees condition:

$$\frac{\partial^2 h}{\partial x\, \partial y} > 0 \tag{38}$$

we have the standard characterization result (see [17], [24], [21]):

Proposition 4 $s$ is implementable if and only if $s$ is nondecreasing.

Remark. Note that the Spence-Mirrlees condition (38) implies our assumption (3)
on $h$.

In this one-dimensional case and under assumption (38), we also have the
other characterization:

Lemma 4 The following assertions are equivalent:

1) $s$ is nondecreasing,

2) there exists $\psi$ h-convex such that $s(x) \in \partial^h\psi(x)$ for all $x \in \Omega$.

Proof. First assume that $s$ is nondecreasing and define:

$$\psi(x) := \int_a^x \frac{\partial h}{\partial x}(t, s(t))\, dt;$$

we are going to prove that $\psi$ is h-convex and $s(x) \in \partial^h\psi(x)$ for all $x$. Define
for all $x$:

$$\tilde{\psi}(x) := \sup_{x' \in \Omega} \psi(x') + h(x, s(x')) - h(x', s(x')).$$

$\tilde{\psi}$ is h-convex and $\tilde{\psi} \ge \psi$; let us show that $\tilde{\psi} \le \psi$. Let $x' \in \Omega$; we have:

$$\psi(x) - \psi(x') - h(x, s(x')) + h(x', s(x')) = \int_{x'}^x \left[\frac{\partial h}{\partial x}(t, s(t)) - \frac{\partial h}{\partial x}(t, s(x'))\right] dt$$

and the latter is nonnegative using the fact that $s$ is nondecreasing and (38).
This yields $\tilde{\psi} \le \psi$ so that $\psi$ is h-convex, and the previous computation also
yields $s(x) \in \partial^h\psi(x)$.

Conversely, assume that for all $x$, $s(x) \in \partial^h\psi(x)$ for some h-convex $\psi$. We
have:

$$\psi(x') - \psi(x) \ge h(x', s(x)) - h(x, s(x))$$

and

$$\psi(x) - \psi(x') \ge h(x, s(x')) - h(x', s(x'))$$

so that:

$$h(x', s(x)) - h(x, s(x)) + h(x, s(x')) - h(x', s(x')) \le 0$$

or equivalently:

$$\int_x^{x'} \int_{s(x')}^{s(x)} \frac{\partial^2 h}{\partial x\, \partial y}(u, v)\, dv\, du \le 0;$$

note finally that, with (38), the previous expression has the sign of $(x' -
x)(s(x) - s(x'))$ so that $s$ is nondecreasing.

□

The previous characterizations can be viewed equivalently as a re-allocation
principle via monotone rearrangements:

Proposition 5 Let $s_0$ be some Borel map $\Omega \to Y$ and let $s$ be the nondecreasing
rearrangement of $s_0$ with respect to $\mu$; then $s$ is the only Borel map
which satisfies:

1) $s$ is implementable,

2) $s$ and $s_0$ are equimeasurable, i.e. $s\sharp\mu = s_0\sharp\mu$.

For properties of monotone rearrangements see [18]. Recall that $s$ is defined
as follows; first define:

$$F_{s_0}(y) := \mu(\{s_0 < y\}), \quad \text{for all } y \in Y;$$

then, for all $x \in \Omega$:

$$s(x) := \inf\{y \in Y \text{ s.t. } F_{s_0}(y) > x\}.$$

Remark. $s$ is the solution of:

$$\sup_{\sigma \in \Delta(\mu, s_0\sharp\mu)} \int_\Omega h(x, \sigma(x))\, d\mu(x).$$

In other words, $s$ maximizes the average surplus in the class of maps that have
the same cumulative function as $s_0$.
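The one-dimensional re-allocation principle has a simple finite analogue that can be verified by enumeration. The sketch below assumes equally weighted types and a hypothetical surplus function $h(x, y) = xy - y^2$, whose cross derivative equals 1, so that (38) holds; the data and the helper name `total_surplus` are illustrative only. It compares the sorted (nondecreasing) rearrangement of an initial allocation `s0` with the best reassignment found over all permutations.

```python
from itertools import permutations

def total_surplus(h, types, actions):
    """Sum of h(x, s(x)) when the i-th type receives the i-th action."""
    return sum(h(x, y) for x, y in zip(types, actions))

# Hypothetical data: increasing, equally weighted types and an arbitrary allocation s0.
types = [1, 2, 3, 4, 5]
s0 = [7, 2, 9, 4, 3]
h = lambda x, y: x * y - y * y      # cross derivative d2h/dxdy = 1 > 0

rearranged = sorted(s0)              # nondecreasing rearrangement of s0
best = max(permutations(s0), key=lambda a: total_surplus(h, types, a))
```

Every permutation of `s0` has the same distribution of actions, so the enumeration ranges over the discrete counterpart of the equimeasurable maps; with distinct action values the maximizer is unique and coincides with `rearranged`.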

The previous remark can be viewed as an easy consequence of the Hardy-Littlewood
inequality (see [18]).

Let us finally note that if $h$ and $g$ both satisfy (38) then the associated
Monge's problems have the same solution. Of course, this result is very specific
to the one-dimensional problem.

Proposition 6 Let $h$ and $g$ be two functions from $\Omega \times Y$ to $\mathbb{R}$ of class $C^2$ that
both satisfy (38); let $\nu$ be some Radon probability measure on $Y$. Then both
problems:

$$\sup_{s \in \Delta(\mu, \nu)} J_h(s) := \int_\Omega h(x, s(x))\, d\mu(x)$$

and

$$\sup_{s \in \Delta(\mu, \nu)} J_g(s) := \int_\Omega g(x, s(x))\, d\mu(x)$$

have the same solution.

Proof. Let $s$ be the maximizer of $J_h$ over $\Delta(\mu, \nu)$; from Lemma 4 and Theorem 1,
$s$ is nondecreasing. There exists then $\psi$ g-convex such that $s(x) \in
\partial^g\psi(x)$ for all $x$ and since $s\sharp\mu = \nu$, $s$ maximizes $J_g$ over $\Delta(\mu, \nu)$. □



4.2 Re-allocation principle in the general case 

Our aim now is to consider the general problem where $\Omega$ is a bounded connected
open subset of $\mathbb{R}^n$ and $\mu$ is some probability measure in $\Omega$ which is
absolutely continuous with a positive Radon-Nikodym derivative with respect
to the n-dimensional Lebesgue measure, and such that $\mu(\partial\Omega) = 0$, $Y$ is a
compact Polish space and $h$ satisfies (1), (2) and (3).

We shall prove a re-allocation principle similar to that of the one-dimensional
case, so that (3) is a natural generalization of the Spence-Mirrlees condition.
A first attempt was made by McAfee and McMillan in [15] to characterize
incentive-compatibility in a multi-dimensional setting; the condition of these
authors is much stronger than (3) and their characterization requires $s$ to be
differentiable, which is of course not required in what follows since $Y$ need not
have a linear structure.

We start with a characterization result that can be found in [5]:

Proposition 7 Let $s : \Omega \to Y$ and $t : \Omega \to \mathbb{R}$; then we have:

1) $(s, t)$ is incentive-compatible if and only if

$V_{s,t}$ is h-convex, and
$s(x) \in \partial^h V_{s,t}(x)$ for all $x \in \Omega$.

2) $s$ is implementable if and only if there exists some h-convex function $\psi$
such that:

$$s(x) \in \partial^h\psi(x) \quad \text{for all } x \in \Omega.$$

Then the re-allocation principle can be stated as:

Theorem 2 Let $s_0$ be an arbitrary Borel function $\Omega \to Y$; there exists a unique
(up to a.e. equivalence) Borel map $s$ such that:

1) $s$ is implementable,

2) $s_0$ and $s$ are equimeasurable, i.e. $s\sharp\mu = s_0\sharp\mu$.

Moreover, $s$ is the solution of the Monge's Problem:

$$\sup_{\sigma \in \Delta(\mu, s_0\sharp\mu)} \int_\Omega h(x, \sigma(x))\, d\mu(x). \tag{39}$$

Proof. The proof follows directly from Theorem 1 and Proposition 7.

□



Remark. The economic interpretation of this result is the following: any
allocation plan can be rearranged into some implementable one in a unique
way; $s$ is therefore in some sense a monotone rearrangement of $s_0$ and it is
obtained by maximizing the average surplus in the set of measure-preserving
maps $\Delta(\mu, s_0\sharp\mu)$.

Moreover, at least from a theoretical point of view, one can use our main
result to find a tariff $t$ such that the pair $(s, t)$ is incentive-compatible. Let $(\overline{\psi}, \overline{\phi})$
be a solution of the dual problem of (39), and define for all $x \in \Omega$:

$$\overline{t}(x) := -\overline{\phi}(s(x)) = -h(x, s(x)) + \overline{\psi}(x);$$

then it can be checked easily that the pair $(s, \overline{t})$ is an incentive-compatible
contract. Let us indeed consider a pair of types $(x, x')$; we have:

$$h(x, s(x)) + \overline{t}(x) = \overline{\psi}(x) = \overline{\phi}^h(x) \ge h(x, s(x')) - \overline{\phi}(s(x')) = h(x, s(x')) + \overline{t}(x').$$
Abstract. Let $r > 0$ be a finite delay and $C_0 = C_E([-r, 0])$ be the Banach space
of continuous vector-valued functions defined on $[-r, 0]$ taking values in a separable
reflexive Banach space $E$ such that its strong dual $E'$ is uniformly convex. We discuss
the existence of strong solutions for an evolution equation of the form
$\dot u(t) \in -A(t)u(t) + F(t, \tau(t)u)$ a.e. in $[0, 1]$, where $u : [-r, 1] \to E$ is a continuous mapping
from $[-r, 1]$ into $E$ such that its restriction to $[-r, 0]$ is equal to $\varphi \in C_0$, its restriction
to $[0, 1]$ is absolutely continuous (i.e. $u(t) = u(0) + \int_0^t \dot u(s)\, ds$, $\forall t \in [0, 1]$, with
$\dot u \in L^1_E([0, 1])$) and satisfies the preceding inclusion; $A(t)$ is an m-accretive multivalued
operator on $E$, $F : [0, 1] \times C_0 \rightrightarrows E$ is a convex weakly compact valued, separately
scalarly measurable and separately scalarly upper semicontinuous multifunction, and
$(\tau(t)u)(s) = u(t + s)$, $\forall s \in [-r, 0]$. Some applications to the sweeping process (or
Moreau's process) and Optimal Control involving Young measures are also presented.

Key words: accretive operator, ball-compact, functional differential inclusion, maximal
monotone operator, multifunction, original control, relaxed control, relaxation,
Young measure



Introduction 

Let $r > 0$ be a finite delay, $E$ be a reflexive separable Banach space such
that its strong dual $E'$ is uniformly convex, $A(t)$ ($t \in [0, 1]$) be an m-accretive
multivalued operator on $E$, $C_0 = C_E([-r, 0])$ be the Banach space of all continuous
$E$-valued functions defined on $[-r, 0]$ equipped with the norm of uniform
convergence, and $F : [0, 1] \times C_E([-r, 0]) \rightrightarrows E$ be a separately scalarly measurable
and separately scalarly upper semicontinuous convex weakly compact
valued multifunction. For any $t \in [0, 1]$, let $\tau(t) : C_E([-r, t]) \to C_0$ be defined
by $(\tau(t)u)(s) = u(t + s)$, $\forall s \in [-r, 0]$ and $\forall u \in C_E([-r, t])$. Let $\varphi$ be a
given element of $C_0$. This paper concerns the existence of strong solutions for
an evolution equation governed by m-accretive operators of the type

\[ (P_r) \quad \begin{cases} \dot u(t) \in -A(t)u(t) + F(t, \tau(t)u), & \text{a.e. } t \in [0, 1], \\ u(s) = \varphi(s), \ \forall s \in [-r, 0]; \quad u(t) \in D(A(t)), \ \forall t \in [0, 1]. \end{cases} \]

By a strong solution of $(P_r)$ we mean a function $u : [-r, 1] \to E$ such that
its restriction to $[-r, 0]$ is equal to $\varphi$, its restriction to $[0, 1]$ is absolutely
continuous, i.e. $u(t) = u(0) + \int_0^t \dot u(s)\, ds$, $\forall t \in [0, 1]$, with $\dot u \in L^1_E([0, 1])$, and
it satisfies $(P_r)$.

In section 2 we state the existence of strong solutions for the problem
$(P_r)$ and its consequences via a new discretization technique. As an illustration
of this technique, we also present a viable strong solution to an evolution
equation governed by an m-accretive operator $A(t)$ of the form
$\dot u(t) \in -A(t)u(t)$ a.e. $t \in [0, 1]$; $u(0) = a \in C(0) \cap D$, $u(t) \in C(t) \cap D$, $\forall t \in [0, 1]$.
Here $D = D(A(t))$ for all $t \in [0, 1]$, and $C : [0, 1] \rightrightarrows E$ is a closed
ball-compact valued (that is, the intersection of $C(t)$ with any closed ball of
$E$ is compact), upper semicontinuous multifunction satisfying a viability type
condition, namely $[I_E + \lambda A(t)]^{-1} C(s) \subset C(t)$ for all $\lambda > 0$ and for all
$0 \le s \le t \le 1$. Applications to the sweeping process [7, 8, 20, 21, 26, 27] and to
Optimal Control problems for the equations under consideration with the help
of Young measures (see e.g. [28]) are given in section 3. The present work is
essentially a continuation of the ones given in [4, 5, 7, 8, 13, 15, 25, 26]. For
more on evolution problems with delay in Banach spaces and related results,
see [18, 23, 24, 31] and the references therein.

1. Notations and preliminaries 

We will use the following definitions and notations and summarize some basic 
results. 

- $E$ is a separable Banach space, $I_E$ is the identity mapping on $E$ and $E'$
is the topological dual of $E$.

- $\overline{B}_E$ is the closed unit ball of $E$.

- $cc(E)$ (resp. $cwk(E)$) is the collection of nonempty closed convex (resp.
weakly compact convex) subsets of $E$.

- If $A$ is a subset of $E$, $\delta^*(\cdot, A)$ is the support function of $A$.

- $\mathcal{L}([0, 1])$ is the $\sigma$-algebra of Lebesgue measurable subsets of $[0, 1]$.

- If $X$ is a topological space, $\mathcal{B}(X)$ is the Borel tribe of $X$.

- If $A$ and $B$ are two subsets of $E$, the excess of $A$ over $B$ is

\[ e(A, B) = \sup\{d(a, B) : a \in A\} \]

and their Hausdorff distance is

\[ h(A, B) = \sup\{e(A, B), e(B, A)\}. \]

The excess $e(A, \{0\})$ is denoted by $|A|$, where $|A| = \sup\{\|a\| : a \in A\}$.
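For finite subsets of a Euclidean space the excess and the Hausdorff distance can be computed directly from these definitions. The following sketch is an illustration only; the finite point sets and the function names `dist_point_set`, `excess` and `hausdorff` are assumptions made for the example.

```python
import math

def dist_point_set(a, B):
    """d(a, B): distance from the point a to the finite set B."""
    return min(math.dist(a, b) for b in B)

def excess(A, B):
    """e(A, B) = sup{ d(a, B) : a in A } for finite sets A and B."""
    return max(dist_point_set(a, B) for a in A)

def hausdorff(A, B):
    """h(A, B) = sup{ e(A, B), e(B, A) }."""
    return max(excess(A, B), excess(B, A))

# Illustrative data (assumed): two finite subsets of the plane.
A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 0.0)]

print(excess(A, B), excess(B, A), hausdorff(A, B))
```

The example shows that the excess is not symmetric: here $e(A, B) = 1$ while $e(B, A) = 0$, and $e(A, \{0\})$ coincides with $|A|$.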

- $L^1_E([0, 1], dt)$ (shortly $L^1_E([0, 1])$) is the Banach space of Lebesgue-Bochner
integrable functions $f : [0, 1] \to E$.

- A mapping $u : [0, 1] \to E$ is absolutely continuous if there is a function
$\dot u \in L^1_E([0, 1])$ such that $u(t) = u(0) + \int_0^t \dot u(s)\, ds$, $\forall t \in [0, 1]$.

- If $X$ is a topological space, $C_E(X)$ is the space of continuous mappings
$u : X \to E$ equipped with the norm of uniform convergence:

\[ \|u\|_{C_E(X)} = \sup_{x \in X} \|u(x)\|. \]

- A multifunction $F : [0, 1] \rightrightarrows E$ is measurable if its graph belongs
to $\mathcal{L}([0, 1]) \otimes \mathcal{B}(E)$. A convex weakly compact valued multifunction
$F : X \to cwk(E)$ defined on a topological space $X$ is scalarly upper semicontinuous
if for every $x' \in E'$, the scalar function $\delta^*(x', F(\cdot))$ is upper
semicontinuous on $X$.

- A multivalued operator $A(t) : E \to 2^E$ ($t \in [0, 1]$) is m-accretive if,
for each $t \in [0, 1]$ and each $\lambda > 0$, $R(I_E + \lambda A(t)) = E$, and for each
$x_1 \in D(A(t))$, $x_2 \in D(A(t))$, $y_1 \in A(t)x_1$, $y_2 \in A(t)x_2$, we have

\[ (j) \quad \|x_1 - x_2\| \le \|(x_1 - x_2) + \lambda (y_1 - y_2)\|, \]

where $D(A(t)) := \{x \in E : A(t)x \ne \emptyset\}$.

- If $A(t)$ is m-accretive, then

\[ (jj) \quad \frac{1}{\lambda} \|J_\lambda A(t)x - x\| = \|A_\lambda(t)x\| \le |A(t)x|_0 := \inf_{y \in A(t)x} \|y\|, \quad \forall x \in D(A(t)), \]

where $J_\lambda A(t)x = (I_E + \lambda A(t))^{-1} x$. We refer to [1, 3, 31] for the theory
of accretive operators and equations of evolution in Banach spaces. We
refer to [9] for Measurable Multifunctions and Convex Analysis.
Throughout, $E$ is a separable reflexive Banach space such that its strong
dual is uniformly convex.
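The resolvent $J_\lambda A$ and the Yosida approximation $A_\lambda = \frac{1}{\lambda}(I - J_\lambda A)$ can be made concrete in the simplest setting. The sketch below assumes $E = \mathbb{R}$ and the time-independent operator $A = \partial|\cdot|$ (so $Ax = \{\mathrm{sign}(x)\}$ for $x \ne 0$ and $A0 = [-1, 1]$); this choice and the function names are illustrative assumptions, not taken from the text. For this operator the resolvent is the familiar soft-thresholding map.

```python
def resolvent(x, lam):
    """J_lam x = (I + lam*A)^{-1} x for A = subdifferential of |.| on R."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

def yosida(x, lam):
    """Yosida approximation A_lam x = (x - J_lam x) / lam."""
    return (x - resolvent(x, lam)) / lam

def min_norm(x):
    """|Ax|_0: smallest absolute value of an element of Ax."""
    return 0.0 if x == 0 else 1.0

# Check inequality (jj) and the nonexpansiveness of J_lam on a few sample points.
samples = [-2.0, -0.3, 0.0, 0.05, 0.7, 3.0]
for lam in (0.1, 0.5, 1.0):
    for x in samples:
        assert abs(yosida(x, lam)) <= min_norm(x) + 1e-9
        for y in samples:
            assert abs(resolvent(x, lam) - resolvent(y, lam)) <= abs(x - y) + 1e-12
```

The two assertions correspond to $\|A_\lambda x\| \le |Ax|_0$ and to the fact that $(I + \lambda A)^{-1}$ is nonexpansive, both in this one-dimensional special case only.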

2. Existence results 

We will consider the following assumptions: 

$(H_1)$ There exist a continuous function $\rho : [0, 1] \to E$ and a nondecreasing
function $L : [0, \infty[ \to [0, \infty[$ such that

\[ \|J_\lambda A(t)x - J_\lambda A(s)x\| \le \lambda \|\rho(t) - \rho(s)\| L(\|x\|) \]

for all $\lambda \in \, ]0, 1]$, for all $(t, s) \in [0, 1] \times [0, 1]$, and for all $x \in E$.

$(H_2)$ For every $s > 0$, there exists $\delta(s) > 0$ such that

\[ \|J_\lambda A(0)x - x\| \le \lambda \delta(s) \]

for all $\lambda \in \, ]0, 1]$ and for all $x \in D(A(0))$ with $\|x\| \le s$.

Remarks. 2.1. Assumption $(H_1)$ is similar to the one employed by [3, 12] in the
study of quasi-autonomous evolution equations. By ([12], Lemma 3.1), $(H_1)$
implies that the sets $D(A(t))$ are constant, i.e. $D(A(t)) := D$ for all $t \in [0, 1]$.

2.2. In view of {jj) {H 2 ) is satisfied if, 0 G D{A{0)) = D{A{0)) and A(0) 
satisfies the following boundedness type condition, namely, for any closed ball
$\overline{B}(0, \eta)$ of center 0 with radius $\eta$, the set $\{ |A(0)x|_0 : x \in D(A(0)) \cap \overline{B}(0, \eta) \}$
is bounded in $\mathbb{R}$. In particular, $(H_2)$ is satisfied if $A(0) : D(A(0)) \to E$ is
$cwk(E)$-valued and scalarly upper semicontinuous, and $D(A(0)) \cap \overline{B}(0, \eta)$ is
compact for any closed ball $\overline{B}(0, \eta)$ of center 0 with radius $\eta$, because the sets
$\{A(0)x : x \in D(A(0)) \cap \overline{B}(0, \eta)\}$ are therefore weakly compact.

$(H_3)$ (a) For every $L^2_E([0, 1])$-mapping $u : [0, 1] \to E$ satisfying $u(t) \in
D(A(t))$ for all $t \in [0, 1]$, the multifunction $t \mapsto A(t)u(t)$ is measurable,

(b) for every $x \in E$ and for every $\lambda > 0$, $t \mapsto (I_E + \lambda A(t))^{-1} x$ is measurable,

(c) there exists $g \in L^2_E([0, 1])$ such that $t \mapsto (I_E + \lambda A(t))^{-1} g(t)$ belongs to
$L^2_E([0, 1])$ for all $\lambda > 0$.

$(H_4)$ $F : [0, 1] \times C_E([-r, 0]) \to cwk(E)$ is separately scalarly measurable on
$[0, 1]$ and separately scalarly upper semicontinuous on $C_E([-r, 0])$, and there
exists a convex weakly compact set $K$ such that $F(t, u) \subset K$ for all $(t, u) \in
[0, 1] \times C_E([-r, 0])$.

The following provides a closure type result in the convergence of approx- 
imated solutions. 

Lemma 2.3. Suppose that $A(t) : E \to 2^E$ ($t \in [0, 1]$) is an m-accretive
operator satisfying $(H_3)$, $(u_n)$ and $(v_n)$ are sequences in $L^2_E([0, 1])$ with
$u_n(t) \in D(A(t))$ for every $n$ and for every $t \in [0, 1]$, and $(r_n)$ is a uniformly
bounded sequence of positive measurable functions defined on $[0, 1]$ such that
$r_n \to 0$ pointwise on $[0, 1]$. Assume that the following conditions are satisfied:

(i) $(u_n)$ converges strongly to $u \in L^2_E([0, 1])$ and $(v_n)$ converges to $v \in
L^2_E([0, 1])$ with respect to the weak topology $\sigma(L^2_E, L^2_{E'})$,

(ii) $v_n(t) \in A(t)u_n(t) + r_n(t)\overline{B}_E$ for all $n$ and all $t \in [0, 1]$.

Then we have $v(t) \in A(t)u(t)$ a.e. $t \in [0, 1]$.

Proof. Let $I_{L^2_E([0,1])}$ be the identity operator in $L^2_E([0, 1])$. Let $\mathcal{A}$ be the operator
in $L^2_E([0, 1])$ defined by

\[ v \in \mathcal{A}u \iff v(t) \in A(t)u(t) \ \text{a.e. } t \in [0, 1]. \]

We claim that $\mathcal{A}$ is m-accretive in $L^2_E([0, 1])$. Let $\lambda > 0$ and let $g \in L^2_E([0, 1])$.
By $(H_3)$(c) there exists $g_0 \in L^2_E([0, 1])$ such that $h_0 : t \mapsto (I_E + \lambda A(t))^{-1} g_0(t)$
belongs to $L^2_E([0, 1])$. Since $(I_E + \lambda A(t))^{-1}$ is nonexpansive [31], we deduce
that the function $h : t \mapsto (I_E + \lambda A(t))^{-1} g(t)$ is measurable and belongs to
$L^2_E([0, 1])$ thanks to $(H_3)$(b)-(c). Furthermore, we have $g \in h + \lambda \mathcal{A}h$, so that
$R(I_{L^2_E([0,1])} + \lambda \mathcal{A}) = L^2_E([0, 1])$. Let $U$ be the closed unit ball
of $L^\infty_E([0, 1])$. In view of (ii), $(H_3)$(a) and the measurable selection theorem, we
claim that $v_n \in \mathcal{A}u_n + \mathcal{R}_n$ for all $n$, where

\[ \mathcal{R}_n := \{z \in L^\infty_E([0, 1]) : z = r_n w, \ w \in U\}. \]

Firstly, it is easy to see that $\mathcal{R}_n$ is equal to the set of all measurable selections
of the measurable multifunction $r_n(\cdot)\overline{B}_E$. Secondly, by (ii) and $(H_3)$(a), the
nonempty-valued multifunction $\Psi_n : [0, 1] \rightrightarrows E \times E$ defined by

\[ \Psi_n(t) := \{(x, y) \in A(t)u_n(t) \times r_n(t)\overline{B}_E : x + y = v_n(t)\}, \quad \forall t \in [0, 1], \]

is measurable. By the measurable selection theorem, there is a measurable selection
$x_n$ of the multifunction $A(\cdot)u_n(\cdot)$ and a measurable selection $y_n$ of the
measurable multifunction $r_n(\cdot)\overline{B}_E$ such that $v_n(t) = x_n(t) + y_n(t)$ for all
$t \in [0, 1]$. Moreover there is $w_n \in U$ such that $y_n = r_n w_n$. So we have
$v_n = x_n + r_n w_n \in \mathcal{A}u_n + \mathcal{R}_n$ for all $n$. As $A(t)$ is accretive for each $t \in [0, 1]$,
it is easy to check that $\mathcal{A}$ is accretive in $L^2_E([0, 1])$. Since $E'$ is uniformly convex,
the dual $L^2_{E'}([0, 1])$ of $L^2_E([0, 1])$ is uniformly convex too, see e.g. ([32],
Theorem 4.2 and Remark 4.7). Consequently, by ([31], Theor. 1.5.2) the graph
of $\mathcal{A}$ is strongly-weakly sequentially closed. By (i), $u_n$ converges strongly to
$u \in L^2_E([0, 1])$ and $v_n - r_n w_n \to v$ weakly in $L^2_E([0, 1])$ (because $r_n w_n \to 0$
strongly in $L^2_E([0, 1])$), while $v_n - r_n w_n \in \mathcal{A}u_n$ by what has been proved. So we
conclude that $v \in \mathcal{A}u$, that is, $v(t) \in A(t)u(t)$ a.e. $t \in [0, 1]$. □
Remarks. (1) If $r_n(\cdot) = 0$, $\forall n$, $(H_3)$(a) can be dropped. In particular, Lemma
2.3 shows that the m-accretiveness lifts from $E$ to $L^2_E([0, 1])$.

(2) Here is an example of a maximal monotone operator in a separable Hilbert
space $H$ satisfying $(H_3)$. Let $f : [0, 1] \times H \to \mathbb{R}$ be a Carathéodory function
such that $f(t, \cdot)$ is convex and such that $|f(t, x) - f(t, y)| \le \beta(t)\|x - y\|$
for all $(t, x, y) \in [0, 1] \times H \times H$, with $\beta \in L^2_{\mathbb{R}^+}([0, 1])$. Let $\partial f_t$ denote the
subdifferential of $f_t$. Then the following hold. For every $v \in H$ and for every
$x \in H$, the subdifferential $\partial f_t(x)$ of $f_t$ at the point $x$ is a convex weakly
compact subset of $H$ and its support function $\delta^*(v, \partial f_t(x))$ is equal to

\[ f_t'(x; v) = \inf_{\delta > 0} \frac{f_t(x + \delta v) - f_t(x)}{\delta}. \]

Taking the expression of $f_t'(x; v)$ into account, it is easy to check that, if
$x : [0, 1] \to H$ and $y : [0, 1] \to H$ are two measurable mappings, then the function
$t \mapsto f_t'(x(t); y(t))$ is measurable. In particular this shows that the multifunction
$t \mapsto A(t)x(t) := \partial f_t(x(t))$ is scalarly measurable, hence $t \mapsto A(t)x(t)$
is measurable. So $A(t)$ satisfies $(H_3)$(a). Let $g : [0, 1] \to H$ be a measurable
mapping. We claim that the single-valued function $t \mapsto (I_H + \lambda A(t))^{-1} g(t)$
is measurable. First suppose that $g$ is continuous and $f$ is globally continuous.
It is clear that

\[ (I_H + \lambda A(t))^{-1} g(t) = \{y \in H : g(t) \in y + \lambda A(t)y\} = \{y \in H : d(g(t) - y, \lambda A(t)y) = 0\}. \]

Notice that, for every $z \in H$,

\[ d(z, \lambda A(t)y) = \sup_{\|x'\| \le 1} [\langle x', z \rangle - \delta^*(x', \lambda A(t)y)]. \]

Since the functions

\[ (t, y, z) \mapsto \langle x', z \rangle - \delta^*(x', \lambda A(t)y) \]

are lower semicontinuous, because

\[ (t, y) \mapsto \delta^*(x', A(t)y) = \inf_{\delta > 0} \frac{f_t(y + \delta x') - f_t(y)}{\delta} \]

is upper semicontinuous, so is the function

\[ (t, y, z) \mapsto d(z, \lambda A(t)y). \]

Hence

\[ \{(t, y) \in [0, 1] \times H : d(g(t) - y, \lambda A(t)y) = 0\} \]

is a Borel set in $[0, 1] \times H$. Thus $t \mapsto J_\lambda A(t)g(t) = (I_H + \lambda A(t))^{-1} g(t)$
is a Borel function on $[0, 1]$ when $g$ is continuous. Consequently
$t \mapsto (I_H + \lambda A(t))^{-1} g(t)$ is measurable when $g$ is measurable, by applying Lusin's theorem
to $g$ and Scorza-Dragoni's theorem to $f$. So $A(t)$ satisfies $(H_3)$(b). As the
directional derivative $f_t'(x; v)$ of $f_t$ at the point $x$ in the direction $v$ is equal
to the support function $\delta^*(v, \partial f_t(x))$ of $\partial f_t(x)$ at the point $v$, using the inequality
$|f(t, x) - f(t, y)| \le \beta(t)\|x - y\|$ yields

\[ \delta^*(v, \partial f_t(x)) = f_t'(x; v) = \inf_{\delta > 0} \frac{f_t(x + \delta v) - f_t(x)}{\delta} \le \beta(t)\|v\|, \]

so that $\partial f_t(x) = A(t)x \subset \beta(t)\overline{B}_H$ for all $(t, x) \in [0, 1] \times H$. Let $A_\lambda(t)$ be
the Yosida approximation of $A(t)$ given by

\[ A_\lambda(t) = \frac{1}{\lambda} [I_H - J_\lambda A(t)]. \]

By what has been demonstrated, the functions $t \mapsto A_\lambda(t)g(t)$ are measurable
when $g(\cdot)$ is measurable. Recall that

\[ \|A_\lambda(t)x\| \le \inf\{\|y\| : y \in \partial f_t(x)\} \le \sup\{\|y\| : y \in \partial f_t(x)\} \le \beta(t) \]

for all $(t, x) \in [0, 1] \times H$. Hence the subdifferential operator $A(t) = \partial f_t$
satisfies $(H_3)$(a), (b), (c). In particular, let us consider a closed convex
1-Lipschitzean moving set $C(t)$ in $H$:

\[ |d(x, C(t)) - d(y, C(s))| \le \|x - y\| + |t - s| \]

for all $x, y \in H$ and for all $t, s \in [0, 1]$. Then the subdifferential $\partial d_{C(t)}$ of
the distance function $d_{C(t)} : x \mapsto d(x, C(t))$ is a maximal monotone operator
depending on $t$ and satisfying $(H_3)$. The preceding example will be
useful in the next section. Another example ensuring the measurability of
$(I_H + \lambda A(t))^{-1} g(t)$ is in [22], where $A(t) = \partial f_t$ is the subdifferential of
a lower semicontinuous convex integrand $f_t$ ($t \in [0, 1]$) defined on a Hilbert
space.

Theorem 2.4. Suppose that $A(t) : E \to cc(E) \cup \{\emptyset\}$; $t \in [0, 1]$, is an
m-accretive operator, $(H_1)$, $(H_2)$, $(H_3)$, $(H_4)$ are satisfied, and $D$ is ball-compact,
i.e. the intersection of $D$ with any closed ball of $E$ is compact. Then
for any $\varphi \in C_E([-r, 0])$ with $\varphi(0) \in D$, there exists at least one strong solution
$u \in C_E([-r, 1])$ to the problem

\[ (P_r) \quad \begin{cases} \dot u(t) \in -A(t)u(t) + F(t, \tau(t)u) & \text{a.e. } t \in [0, 1], \\ u(s) = \varphi(s), \ \forall s \in [-r, 0]; \quad u(t) \in D, \ \forall t \in [0, 1]. \end{cases} \]
Proof. Step 1. We will prove the theorem in the particular case when $(H_4)$ is
replaced by a stronger one: $F : [0, 1] \times C_E([-r, 0]) \to cwk(E)$ is scalarly
upper semicontinuous on $[0, 1] \times C_E([-r, 0])$ and there exists a convex weakly
compact set $K \subset E$ such that $F(t, u) \subset K$ for all $(t, u) \in [0, 1] \times C_E([-r, 0])$.

We define a sequence of continuous functions $(u_n)$ in $C_E([-r, 1])$ such that
a subsequence of it converges uniformly to a strong solution of $(P_r)$. Let $n \ge 1$.
We put $u_n = \varphi$ on $[-r, 0]$. As $F$ is scalarly upper semicontinuous on
$[0, 1] \times C_E([-r, 0])$, $F$ is, a fortiori, scalarly $\mathcal{L}([0, 1]) \otimes \mathcal{B}(C_E([-r, 0]))$-measurable.
Let $\sigma : [0, 1] \times C_E([-r, 0]) \to E$ be a scalarly $\mathcal{L}([0, 1]) \otimes \mathcal{B}(C_E([-r, 0]))$-measurable
selection of $F$. Let $n \ge 1$ be a fixed integer. In order to construct
$u_n$ on $[0, 1]$, we consider a partition of $[0, 1]$ by the points $t_k^n = k e_n$, $e_n = \frac{1}{n}$,
$k = 0, 1, 2, \ldots, n$. We define $u_n$ by linear interpolation, where $u_n(t_k^n) = x_k^n
\in D(A(t_k^n))$, $k = 0, 1, 2, \ldots, n$, are obtained by induction starting with
$u_n(0) = x_0^n = \varphi(0) \in D$. Let us set

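\[ x_1^n := J_{e_n} A(t_1^n)\big(x_0^n + e_n \sigma(t_0^n, \varphi)\big). \]

By construction we have $x_1^n \in D(A(t_1^n)) \subset D$. For $t \in \, ]t_0^n, t_1^n]$, let us set

\[ u_n(t) = \frac{t_1^n - t}{e_n} x_0^n + \frac{t - t_0^n}{e_n} x_1^n. \]

Then for $t \in \, ]t_0^n, t_1^n[$ we have

\[ \dot u_n(t) = \frac{x_1^n - x_0^n}{e_n} \in -A(t_1^n)x_1^n + \sigma(t_0^n, \varphi). \]

By induction on $0 \le k < n$ we set

\[ x_{k+1}^n := J_{e_n} A(t_{k+1}^n)\big(x_k^n + e_n \sigma(t_k^n, \tau(t_k^n)u_n)\big). \tag{2.4.1} \]

Then $x_k^n \in D(A(t_k^n)) \subset D$ for $k = 0, 1, 2, \ldots, n$. For $t \in \, ]t_k^n, t_{k+1}^n]$, $0 \le k \le n - 1$,
we set

\[ u_n(t) = \frac{t_{k+1}^n - t}{e_n} x_k^n + \frac{t - t_k^n}{e_n} x_{k+1}^n; \]

then, for $t \in \, ]t_k^n, t_{k+1}^n[$, we have

\[ \dot u_n(t) = \frac{x_{k+1}^n - x_k^n}{e_n} \in -A(t_{k+1}^n)x_{k+1}^n + \sigma(t_k^n, \tau(t_k^n)u_n). \tag{2.4.2} \]

For each $t \in [0, 1]$ and each $n \ge 1$, let $\delta_n(t) = t_k^n$, $\theta_n(t) = t_{k+1}^n$ if
$t \in \, ]t_k^n, t_{k+1}^n]$, and $\delta_n(0) = \theta_n(0) = 0$. So by (2.4.2) we get

\[ \dot u_n(t) \in -A(\theta_n(t))u_n(\theta_n(t)) + \sigma(\delta_n(t), \tau(\delta_n(t))u_n) \tag{2.4.3} \]

for a.e. $t \in [0, 1]$. It is obvious that, for all $n \ge 1$ and for all $t \in [0, 1]$, the
following hold:

\[ \sigma(\delta_n(t), \tau(\delta_n(t))u_n) \in F(\delta_n(t), \tau(\delta_n(t))u_n), \tag{2.4.4} \]

\[ u_n(\delta_n(t)), \ u_n(\theta_n(t)) \in D, \tag{2.4.5} \]

\[ \lim_{n \to \infty} \delta_n(t) = \lim_{n \to \infty} \theta_n(t) = t. \tag{2.4.6} \]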
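The discretization above is an implicit Euler scheme driven by the resolvent, and it can be sketched numerically in a toy setting. The code below is a minimal illustration under assumptions that are not in the text: $E = \mathbb{R}$, the operator $A(t)x = x$ (so that $J_e A(t)x = x/(1+e)$), a delay `R = 0.5`, the history `phi(s) = 1 + s`, and a single-valued perturbation `sigma` with values in $K = [-1, 1]$. All names are hypothetical.

```python
import math

R = 0.5  # delay r (illustrative value)

def phi(s):
    """Initial history on [-R, 0] (illustrative choice)."""
    return 1.0 + s

def resolvent(t, x, e):
    """J_e A(t) x = (I + e*A(t))^{-1} x for the toy operator A(t)x = x."""
    return x / (1.0 + e)

def sigma(t, history):
    """Toy single-valued perturbation with values in K = [-1, 1].

    history(s) plays the role of (tau(t)u)(s) = u(t + s) for s in [-R, 0].
    """
    return math.cos(history(-R))

def scheme(n):
    """Return the nodes x_0..x_n and the values sigma_0..sigma_{n-1}."""
    e = 1.0 / n
    nodes = [phi(0.0)]
    sigmas = []

    def u(t):
        # Piecewise-linear interpolant of the nodes built so far; phi on [-R, 0].
        if t <= 0.0:
            return phi(t)
        j = min(int(t / e), len(nodes) - 2)
        w = t / e - j
        return (1.0 - w) * nodes[j] + w * nodes[j + 1]

    for k in range(n):
        t_k, t_next = k * e, (k + 1) * e
        s_k = sigma(t_k, lambda s: u(t_k + s))
        sigmas.append(s_k)
        # x_{k+1} = J_e A(t_{k+1}) (x_k + e * sigma_k), as in (2.4.1)
        nodes.append(resolvent(t_next, nodes[-1] + e * s_k, e))
    return nodes, sigmas

nodes, sigmas = scheme(50)
```

In this toy case each step satisfies the discrete relation $(x_{k+1} - x_k)/e_n = -x_{k+1} + \sigma_k$, the single-valued counterpart of (2.4.2), and the difference quotients stay bounded independently of $n$, which is the kind of estimate the proof establishes next.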

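Claim: $(\dot u_n)$ is bounded a.e.

Pick $s > 0$ such that $\|\varphi(0)\| \le s < +\infty$. By $(H_2)$ there is $\delta(s)$ such that

\[ \|J_{e_n} A(t_0^n)\varphi(0) - \varphi(0)\| \le e_n \delta(s). \]

Let $\rho$ be as in $(H_1)$. There is $\gamma > 0$ such that $\|\rho(t)\| \le \frac{\gamma}{2}$ for all $t \in [0, 1]$. By
using $(H_1)$ and the preceding inequality, we obtain the estimate

\[ \begin{aligned} \|x_k^n - \varphi(0)\| &\le \|J_{e_n} A(t_k^n)\big(x_{k-1}^n + e_n \sigma(t_{k-1}^n, \tau(t_{k-1}^n)u_n)\big) - J_{e_n} A(t_k^n)\varphi(0)\| \\ &\quad + \|J_{e_n} A(t_k^n)\varphi(0) - J_{e_n} A(t_0^n)\varphi(0)\| \\ &\quad + \|J_{e_n} A(t_0^n)\varphi(0) - \varphi(0)\| \\ &\le \|x_{k-1}^n - \varphi(0)\| + e_n \|\sigma(t_{k-1}^n, \tau(t_{k-1}^n)u_n)\| + e_n \gamma L(\|\varphi(0)\|) + e_n \delta(s) \\ &\le \|x_{k-1}^n - \varphi(0)\| + e_n \big(|K| + \gamma L(\|\varphi(0)\|) + \delta(s)\big). \end{aligned} \]

Iterating the preceding inequality gives:

\[ \|x_k^n - \varphi(0)\| \le t_k^n \big(|K| + \gamma L(\|\varphi(0)\|) + \delta(s)\big), \tag{2.4.7} \]

for all $n \ge 1$ and for all $k = 1, 2, \ldots, n$. So we have

\[ \|x_k^n\| \le \beta := \|\varphi(0)\| + |K| + \gamma L(\|\varphi(0)\|) + \delta(s), \tag{2.4.8} \]

for all $n \ge 1$ and for all $k = 0, 1, 2, \ldots, n$. Using (2.4.8) and $(H_2)$, there is
$C(\beta) > 0$ such that $\|J_{e_n} A(t_0^n)x_k^n - x_k^n\| \le e_n C(\beta)$. So by $(H_1)$ and $(H_2)$, we
get the estimate

\[ \begin{aligned} \|x_{k+1}^n - x_k^n\| &\le \|J_{e_n} A(t_{k+1}^n)\big(x_k^n + e_n \sigma(t_k^n, \tau(t_k^n)u_n)\big) - J_{e_n} A(t_{k+1}^n)x_k^n\| \\ &\quad + \|J_{e_n} A(t_{k+1}^n)x_k^n - J_{e_n} A(t_0^n)x_k^n\| \\ &\quad + \|J_{e_n} A(t_0^n)x_k^n - x_k^n\| \\ &\le e_n \big(|K| + \gamma L(\beta) + C(\beta)\big) \end{aligned} \]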







C. Castaing, A.G. Ibrahim 



for all $n \ge 1$ and for all $k = 0, 1, 2, \ldots, n - 1$. By (2.4.2) and the preceding estimate,
we get

\[ \|\dot u_n(t)\| \le m := |K| + \gamma L(\beta) + C(\beta), \tag{2.4.9} \]

for all $n \ge 1$ and a.e. $t \in [0, 1]$. That proves the claim. By (2.4.6) and (2.4.9),
we have

\[ \lim_{n \to \infty} \|u_n(\theta_n(t)) - u_n(t)\| = 0, \tag{2.4.10} \]

for all $t \in [0, 1]$. By (2.4.8), $u_n(\theta_n(t)) \in D \cap \overline{B}_E(0, \beta)$ for all $t \in [0, 1]$
and for all $n \in \mathbb{N}$. As $D$ is ball-compact, $(u_n(\theta_n(t)))$ is relatively compact
in $E$ for every $t \in [0, 1]$, and so is $(u_n(t))$. Thus $(u_n)$ is relatively compact in
$C_E([-r, 1])$. Hence we may suppose that $(\dot u_n)$ weakly converges in $L^2_E([0, 1])$
to a function $v$ with $\|v(t)\| \le m$ for a.e. $t \in [0, 1]$, and $(u_n)$ converges in
$C_E([-r, 1])$ to a function $u$ with

\[ u(t) = \varphi(0) + \int_0^t v(s)\, ds, \quad \forall t \in [0, 1]. \]

As $\|\sigma(\delta_n(t), \tau(\delta_n(t))u_n)\| \le |K|$ for all $n \ge 1$ and for all $t \in [0, 1]$, we may
suppose that $(g_n(\cdot)) := (\sigma(\delta_n(\cdot), \tau(\delta_n(\cdot))u_n))$ converges weakly in $L^2_E([0, 1])$
to a function $g$ with $\|g(t)\| \le |K|$ for a.e. $t \in [0, 1]$.

Claim: $v(t) \in -A(t)u(t) + g(t)$ a.e. $t \in [0, 1]$.

Let us set $w_n(t) := J_{e_n} A(t)\big(u_n(\delta_n(t)) + e_n g_n(t)\big)$ for all $n \ge 1$ and for all
$t \in [0, 1]$. Then $w_n(t) \in D(A(t))$ for all $t \in [0, 1]$ and we have

\[ u_n(\delta_n(t)) + e_n g_n(t) \in w_n(t) + e_n A(t)w_n(t). \tag{2.4.11} \]

In view of $(H_1)$ and (2.4.8) we have the estimate

\[ \begin{aligned} \|u_n(\theta_n(t)) - w_n(t)\| &\le e_n \|\rho(\theta_n(t)) - \rho(t)\| L(\|u_n(\delta_n(t)) + e_n g_n(t)\|) \\ &\le e_n \|\rho(\theta_n(t)) - \rho(t)\| L(\beta + |K|). \end{aligned} \]

It follows that

\[ \lim_{n \to \infty} \|u_n(\theta_n(t)) - w_n(t)\| = 0, \quad \forall t \in [0, 1]. \tag{2.4.12} \]

Consequently we get

\[ \lim_{n \to \infty} w_n(t) = u(t), \quad \forall t \in [0, 1]. \tag{2.4.13} \]

Let $n \ge 1$ and let $t \in \, ]0, 1[$. Then $t \in \, ]t_k^n, t_{k+1}^n[$ for some $0 \le k < n$. So, taking
(2.4.11) and the preceding estimate into account, we get

\[ \begin{aligned} d\big(-\dot u_n(t) + g_n(t), A(t)w_n(t)\big) &= d\Big(\frac{u_n(\delta_n(t)) - u_n(\theta_n(t))}{e_n} + g_n(t), A(t)w_n(t)\Big) \\ &\le \frac{1}{e_n} \|u_n(\theta_n(t)) - w_n(t)\| \\ &\le \|\rho(\theta_n(t)) - \rho(t)\| L(\beta + |K|). \end{aligned} \]

Since $A(t)w_n(t)$ is closed and convex, the preceding inequality implies that

\[ -\dot u_n(t) + g_n(t) \in A(t)w_n(t) + r_n(t)\overline{B}_E \]

a.e., with $r_n(t) := \|\rho(\theta_n(t)) - \rho(t)\| L(\beta + |K|) \to 0$ for all $t \in [0, 1]$. As
$-\dot u_n(\cdot) + g_n(\cdot) \to -v + g$ weakly in $L^2_E([0, 1])$ and $w_n \to u$ strongly in
$L^2_E([0, 1])$ by (2.4.8), (2.4.12) and (2.4.13), from Lemma 2.3 we deduce that

\[ v(t) \in -A(t)u(t) + g(t) \quad \text{a.e. } t \in [0, 1]. \]



Claim: $g(t) \in F(t, \tau(t)u)$ a.e. $t \in [0, 1]$. Let $t \in [0, 1]$. Using (2.4.9) we get

\[ \begin{aligned} \|\tau(\delta_n(t))u_n - \tau(t)u\|_{C_E([-r,0])} &\le \|\tau(\delta_n(t))u_n - \tau(t)u_n\|_{C_E([-r,0])} + \|\tau(t)u_n - \tau(t)u\|_{C_E([-r,0])} \\ &\le \sup_{s_1, s_2 \in [-r,1],\ |s_1 - s_2| \le e_n} \|u_n(s_1) - u_n(s_2)\| + \|\tau(t)u_n - \tau(t)u\|_{C_E([-r,0])} \\ &\le \sup_{s_1, s_2 \in [-r,0],\ |s_1 - s_2| \le e_n} \|u_n(s_1) - u_n(s_2)\| + \sup_{s_1, s_2 \in [0,1],\ |s_1 - s_2| \le e_n} \|u_n(s_1) - u_n(s_2)\| \\ &\quad + \|\tau(t)u_n - \tau(t)u\|_{C_E([-r,0])} \\ &\le \sup_{s_1, s_2 \in [-r,0],\ |s_1 - s_2| \le e_n} \|\varphi(s_1) - \varphi(s_2)\| + m e_n + \|\tau(t)u_n - \tau(t)u\|_{C_E([-r,0])}. \end{aligned} \]

Using the continuity of $\varphi$, the uniform convergence of $u_n$ towards $u$, and the
preceding estimate, we see that

\[ \lim_{n \to \infty} \|\tau(\delta_n(t))u_n - \tau(t)u\|_{C_E([-r,0])} = 0. \]

As $g_n(t) := \sigma(\delta_n(t), \tau(\delta_n(t))u_n) \in F(\delta_n(t), \tau(\delta_n(t))u_n)$ for all $n \ge 1$ and
for all $t \in [0, 1]$, $\delta_n(t) \to t$ in $[0, 1]$ and $\tau(\delta_n(t))u_n \to \tau(t)u$ in $C_E([-r, 0])$
pointwise on $[0, 1]$, by invoking the scalar upper semicontinuity of $F$ on
$[0, 1] \times C_E([-r, 0])$ and the closure result in ([9], Theorem V-14), we conclude
that $g(t) \in F(t, \tau(t)u)$ a.e. Thus we have completely proved the existence of
strong solutions of $(P_r)$ in the case when $F$ is globally scalarly upper semicontinuous
on $[0, 1] \times C_E([-r, 0])$.

Step 2. Now let us go into the case when $F$ satisfies $(H_4)$. Let us summarize
the results obtained in the first step. Given $\varphi \in C_E([-r, 0])$ with $\varphi(0) \in D$,
there is a continuous mapping $u \in C_E([-r, 1])$ such that its restriction to $[0, 1]$
is an absolutely continuous solution of the evolution equation

\[ \dot u(t) \in -A(t)u(t) + F(t, \tau(t)u) \quad \text{a.e. } t \in [0, 1], \]

and $u(s) = \varphi(s)$, $\forall s \in [-r, 0]$. Moreover we have $u(t) = \varphi(0) +
\int_0^t \dot u(s)\, ds$, $\forall t \in [0, 1]$, with $u(t) \in D$, $\forall t \in [0, 1]$, and $\|\dot u(t)\| \le m$ a.e. $t \in [0, 1]$,
where $m$ is given by the estimate (2.4.9). Let $K_\sigma$ denote $K$ equipped with the
$\sigma(E, E')$ topology. Let us denote by $ck(K_\sigma)$ the set of all nonempty convex
compact subsets of $K_\sigma$. By a recent version of the multivalued Scorza-Dragoni
theorem [6], there exists a multifunction $F_0 : [0, 1] \times C_E([-r, 0]) \to ck(K_\sigma) \cup \{\emptyset\}$
which enjoys the following properties:

(1) there is a Lebesgue negligible set $N \subset [0, 1]$ independent of $(t, u)$ such that
$F_0(t, u) \subset F(t, u)$, $\forall t \in [0, 1] \setminus N$, $\forall u \in C_E([-r, 0])$,

(2) if $x : [0, 1] \to C_E([-r, 0])$ and $y : [0, 1] \to E$ are Lebesgue-measurable
functions with $y(t) \in F(t, x(t))$ a.e., then $y(t) \in F_0(t, x(t))$ a.e.,

(3) for every $\varepsilon > 0$, there is a compact subset $J_\varepsilon \subset [0, 1]$ such that the
Lebesgue measure of $[0, 1] \setminus J_\varepsilon$ is less than $\varepsilon$, the graph of the restriction
$F_0|_{J_\varepsilon \times C_E([-r, 0])}$ is closed in $J_\varepsilon \times C_E([-r, 0]) \times K_\sigma$, and $\emptyset \ne F_0(t, u) \subset
F(t, u)$, $\forall (t, u) \in J_\varepsilon \times C_E([-r, 0])$.

By (3) there exists an increasing sequence of compact sets $(J_n)$ in $[0, 1]$ such
that the Lebesgue measure of $[0, 1] \setminus J_n$ tends to 0 when $n \to \infty$ and that the
graph of the restriction of $F_0$ to $J_n \times C_E([-r, 0])$ is closed in $J_n \times C_E([-r, 0]) \times
K_\sigma$ (i.e., it is upper semicontinuous) and has nonempty values.

Let $F_n$ be a $ck(K_\sigma)$-valued upper semicontinuous extension (see [2]) of
$F_0|_{J_n \times C_E([-r, 0])}$ to $[0, 1] \times C_E([-r, 0])$, with $F_n(t, u) \subset K_\sigma$ for all
$(t, u) \in [0, 1] \times C_E([-r, 0])$.

We now apply Step 1 with $F_n$ substituted for $F$. Thus, using the above remark,
for every $n$, there is a continuous mapping $u_n \in C_E([-r, 1])$ such that its
restriction to $[0, 1]$ is an absolutely continuous solution of the evolution equation

\[ \dot u_n(t) \in -A(t)u_n(t) + F_n(t, \tau(t)u_n) \quad \text{a.e. } t \in [0, 1], \]

with $u_n(s) = \varphi(s)$, $\forall s \in [-r, 0]$, and $u_n(t) = \varphi(0) + \int_0^t \dot u_n(s)\, ds$, $\forall t \in
[0, 1]$, with $u_n(t) \in D$, $\forall t \in [0, 1]$, and $\|\dot u_n(t)\| \le m$ a.e. $t \in [0, 1]$. As
$D$ is ball-compact, we may suppose that $(u_n)$ converges uniformly to $u \in
C_E([-r, 1])$, $(\dot u_n)$ weakly* converges in $L^\infty_E([0, 1])$ to $\dot u$ with $\|\dot u(t)\| \le m$ a.e.,
and $u(t) = \varphi(0) + \int_0^t \dot u(s)\, ds$, $\forall t \in [0, 1]$. We now finish the proof as follows.
There is a measurable mapping $z_n : [0, 1] \to K$ with $z_n(t) \in F_n(t, \tau(t)u_n)$
for a.e. $t \in [0, 1]$ and

\[ \dot u_n(t) \in -A(t)u_n(t) + z_n(t) \]

a.e. $t \in [0, 1]$. We may suppose that $z_n \to z$ weakly* in $L^\infty_E([0, 1])$. By
construction, there is a Lebesgue null set $N_n$ such that $z_n(t) \in F_0(t, \tau(t)u_n)$ for
all $t \in J_n \setminus N_n$. Let $N_0 := ([0, 1] \setminus \cup_n J_n) \cup (\cup_n N_n)$, which is Lebesgue-negligible.
If $t \notin N_0$, there is an integer $p := p(t)$ such that $z_n(t) \in
F_0(t, \tau(t)u_n)$ for all $n \ge p(t)$. As $F_0$ is scalarly upper semicontinuous on
$J_p \times C_E([-r, 0])$ and $\tau(t)u_n \to \tau(t)u$ in $C_E([-r, 0])$, we have

\[ \limsup_n \langle x', z_n(t) \rangle \le \limsup_n \delta^*(x', F_0(t, \tau(t)u_n)) \le \delta^*(x', F_0(t, \tau(t)u)), \]

for all $x' \in E'$ and for all $n \ge p$. Thus

\[ \limsup_n \langle x', z_n(t) \rangle \le \delta^*(x', F_0(t, \tau(t)u)), \]

for a.e. $t \in [0, 1]$. It follows that, for every measurable set $A \subset [0, 1]$ and for
every $x' \in E'$,

\[ \lim_n \int_A \langle x', z_n(t) \rangle\, dt = \int_A \langle x', z(t) \rangle\, dt \le \int_A \delta^*(x', F_0(t, \tau(t)u))\, dt, \]

using Fatou's lemma. Consequently $z(t) \in F_0(t, \tau(t)u) \subset F(t, \tau(t)u)$ a.e.

□

The following corollaries, which are new and will be used later, follow immediately.

Corollary 2.5. Suppose that $A(t) : E \to cc(E) \cup \{\emptyset\}$; $t \in [0, 1]$, is an
m-accretive operator, $(H_1)$, $(H_2)$, $(H_3)$ are satisfied, $D$ is ball-compact, $K$
is a convex weakly compact subset of $E$ and $f : [0, 1] \times C_E([-r, 0]) \to K$ is a
Carathéodory mapping, that is, $f$ is separately measurable on $[0, 1]$ and separately
continuous on $C_E([-r, 0])$. Then for any $\varphi \in C_E([-r, 0])$ with $\varphi(0) \in D$,
there is a continuous mapping $u \in C_E([-r, 1])$ satisfying

\[ (P_r) \quad \begin{cases} \dot u(t) \in -A(t)u(t) + f(t, \tau(t)u) & \text{a.e. } t \in [0, 1], \\ u(s) = \varphi(s), \ \forall s \in [-r, 0]; \quad u(t) \in D, \ \forall t \in [0, 1]. \end{cases} \]

Another immediate corollary of Theorem 2.4 concerns an undelayed differential
inclusion governed by an m-accretive operator of the form

\[ \begin{cases} \dot u(t) \in -A(t)u(t) + G(t, u(t)) & \text{a.e. } t \in [0, 1], \\ u(0) = u_0 \in D, \end{cases} \]

where $G$ is a convex weakly compact valued perturbation defined on $[0, 1] \times E$.

Corollary 2.6. Suppose that $A(t) : E \to cc(E) \cup \{\emptyset\}$; $t \in [0, 1]$, is an
m-accretive operator, $(H_1)$, $(H_2)$, $(H_3)$ are satisfied, $D$ is ball-compact, $K$
is a convex weakly compact subset of $E$ and $G : [0, 1] \times E \rightrightarrows K$ is
separately scalarly measurable on $[0, 1]$ and separately scalarly upper semicontinuous on
$E$. Then for any $u_0 \in D$, there is an absolutely continuous solution $u : [0, 1] \to E$ to

\[ \begin{cases} \dot u(t) \in -A(t)u(t) + G(t, u(t)) & \text{a.e. } t \in [0, 1], \\ u(0) = u_0 \in D. \end{cases} \]

For a single-valued perturbation $G(t, x) = \{g(t, x)\}$, Corollary 2.6 reduces
to the following.

Corollary 2.7. Suppose that $A(t) : E \to cc(E) \cup \{\emptyset\}$; $t \in [0, 1]$, is an
m-accretive operator, $(H_1)$, $(H_2)$, $(H_3)$ are satisfied, $D$ is ball-compact, $K$ is
a convex weakly compact subset of $E$ and $g : [0, 1] \times E \to K$ is a Carathéodory
mapping, that is, $g$ is separately measurable on $[0, 1]$ and separately continuous
on $E$. Then for any $u_0 \in D$, there is an absolutely continuous solution
$u : [0, 1] \to E$ to

\[ (P') \quad \begin{cases} \dot u(t) \in -A(t)u(t) + g(t, u(t)) & \text{a.e. } t \in [0, 1], \\ u(0) = u_0 \in D. \end{cases} \]

The following is a viability result illustrating the techniques developed in
the proof of Theorem 2.4. See [19] for the existence of BV solutions for an
evolution equation governed by a maximal monotone operator $A(t)$ with
time-dependent domain in Hilbert space.

Proposition 2.8. Suppose that $A(t) : E \to cc(E) \cup \{\emptyset\}$; $t \in [0, 1]$, is an
m-accretive operator satisfying $(H_1)$, $(H_2)$, $(H_3)$ and $C : [0, 1] \rightrightarrows E$ is a closed
ball-compact valued (that is, $\forall t \in [0, 1]$, $C(t)$ is closed and its intersection
with any closed ball of $E$ is compact) upper semicontinuous multifunction
satisfying: (*) $0 \in C(t)$ for all $t \in [0, 1]$, and (**) $[I_E + \lambda A(t)]^{-1} C(s) \subset C(t)$
for all $\lambda > 0$ and for all $0 \le s \le t \le 1$. Then for any $a \in C(0) \cap D$, there is
an absolutely continuous solution $u : [0, 1] \to E$ to

\[ \begin{cases} \dot u(t) \in -A(t)u(t) & \text{a.e. } t \in [0, 1], \\ u(0) = a, \quad u(t) \in C(t) \cap D, \ \forall t \in [0, 1]. \end{cases} \]

Proof. This is an easy adaptation of the proof of Theorem 2.4, so we do not
go into every detail. We follow the discretization technique
developed in Theorem 2.4. By (**) we first observe that $[I_E + \lambda A(t)]^{-1} C(s) \subset
C(t) \cap D$ for all $\lambda > 0$ and for all $0 \le s \le t \le 1$. We define a
sequence of absolutely continuous functions $(u_n)$ in $C_E([0, 1])$ such that a
subsequence of it converges uniformly to an absolutely continuous solution
of the equation under consideration. Let $n \ge 1$ be a fixed integer. In order
to construct $u_n$ on $[0, 1]$, we consider a partition of $[0, 1]$ by the points
$t_k^n = k e_n$, $e_n = \frac{1}{n}$, $k = 0, 1, 2, \ldots, n$. We define $u_n$ by linear interpolation,
where $u_n(t_k^n) = x_k^n \in C(t_k^n) \cap D(A(t_k^n))$, $k = 0, 1, 2, \ldots, n$, are obtained by
induction starting with $u_n(0) = x_0^n = a \in C(0) \cap D$. Let us set

\[ x_1^n := J_{e_n} A(t_1^n)x_0^n. \]

We have $x_1^n \in C(t_1^n) \cap D(A(t_1^n)) \subset C(t_1^n) \cap D$. For $t \in \, ]t_0^n, t_1^n]$, let us set

\[ u_n(t) = \frac{t_1^n - t}{e_n} x_0^n + \frac{t - t_0^n}{e_n} x_1^n. \]

Then for $t \in \, ]t_0^n, t_1^n[$, we have

\[ \dot u_n(t) = \frac{x_1^n - x_0^n}{e_n} \in -A(t_1^n)x_1^n. \]

By induction on $0 \le k < n$ we set

\[ x_{k+1}^n := J_{e_n} A(t_{k+1}^n)x_k^n. \tag{2.8.1} \]

Then $x_k^n \in C(t_k^n) \cap D(A(t_k^n)) \subset C(t_k^n) \cap D$ for $k = 0, 1, 2, \ldots, n$.
For $t \in \, ]t_k^n, t_{k+1}^n]$, $0 \le k \le n - 1$, we set

\[ u_n(t) = \frac{t_{k+1}^n - t}{e_n} x_k^n + \frac{t - t_k^n}{e_n} x_{k+1}^n; \]

then, for $t \in \, ]t_k^n, t_{k+1}^n[$, we have

\[ \dot u_n(t) = \frac{x_{k+1}^n - x_k^n}{e_n} \in -A(t_{k+1}^n)x_{k+1}^n. \tag{2.8.2} \]

For each $t \in [0, 1]$ and each $n \ge 1$, let $\delta_n(t) = t_k^n$, $\theta_n(t) = t_{k+1}^n$ if
$t \in \, ]t_k^n, t_{k+1}^n]$, and $\delta_n(0) = \theta_n(0) = 0$. So by (2.8.2) we get

\[ \dot u_n(t) \in -A(\theta_n(t))u_n(\theta_n(t)) \tag{2.8.3} \]

for a.e. $t \in [0, 1]$. It is obvious that, for all $n \ge 1$ and for all $t \in [0, 1]$, the
following hold:

\[ u_n(\delta_n(t)) \in C(\delta_n(t)) \cap D, \quad u_n(\theta_n(t)) \in C(\theta_n(t)) \cap D, \tag{2.8.4} \]

\[ \lim_{n \to \infty} \delta_n(t) = \lim_{n \to \infty} \theta_n(t) = t. \tag{2.8.5} \]


Claim: $(\dot u_n)$ is bounded a.e.

Pick $s > 0$ such that $\|a\| \le s < +\infty$. Since the operator $A(t_0^n)$ satisfies $(H_2)$,
there is $\delta(s)$ such that

\[ \|J_{e_n} A(t_0^n)a - a\| \le e_n \delta(s). \]

Let $\rho$ be as in $(H_1)$. There is $\gamma > 0$ such that $\|\rho(t)\| \le \frac{\gamma}{2}$ for all $t \in [0, 1]$. By
using $(H_1)$ and the preceding inequality, we obtain the estimate

\[ \begin{aligned} \|x_k^n - a\| &\le \|J_{e_n} A(t_k^n)x_{k-1}^n - J_{e_n} A(t_k^n)a\| + \|J_{e_n} A(t_k^n)a - J_{e_n} A(t_0^n)a\| + \|J_{e_n} A(t_0^n)a - a\| \\ &\le \|x_{k-1}^n - a\| + e_n \gamma L(\|a\|) + e_n \delta(s) \\ &= \|x_{k-1}^n - a\| + e_n \big(\gamma L(\|a\|) + \delta(s)\big). \end{aligned} \]

Iterating the preceding inequality gives

\[ \|x_k^n - a\| \le t_k^n \big(\gamma L(\|a\|) + \delta(s)\big), \tag{2.8.6} \]

for all $n \ge 1$ and for all $k = 1, 2, \ldots, n$. So we have

\[ \|x_k^n\| \le \beta := \|a\| + \gamma L(\|a\|) + \delta(s), \tag{2.8.7} \]

for all $n \ge 1$ and for all $k = 0, 1, 2, \ldots, n$. Using (2.8.7) and $(H_2)$ there is
$C(\beta) > 0$ such that $\|J_{e_n} A(t_0^n)x_k^n - x_k^n\| \le e_n C(\beta)$. So, by $(H_1)$ we get the
estimate

\[ \begin{aligned} \|x_{k+1}^n - x_k^n\| &\le \|J_{e_n} A(t_{k+1}^n)x_k^n - J_{e_n} A(t_0^n)x_k^n\| + \|J_{e_n} A(t_0^n)x_k^n - x_k^n\| \\ &\le e_n \big(\gamma L(\beta) + C(\beta)\big) \end{aligned} \]

for all $n \ge 1$ and for all $k = 0, 1, 2, \ldots, n - 1$. By (2.8.2) and the preceding estimate,
we get

\[ \|\dot u_n(t)\| \le m := \gamma L(\beta) + C(\beta), \tag{2.8.8} \]

for all $n \ge 1$ and a.e. $t \in [0, 1]$. Hence the claim is proved. It is obvious that
$(u_n)$ is equicontinuous. By (2.8.5) and (2.8.8), we have

\[ \lim_{n \to \infty} \|u_n(\theta_n(t)) - u_n(t)\| = 0, \tag{2.8.9} \]

for all $t \in [0, 1]$. By construction and (2.8.7), we have $x_k^n \in C(t_k^n) \cap \beta\overline{B}_E$
for all $n \ge 1$ and all $k = 0, 1, 2, \ldots, n$. So, for every $t \in [0, 1]$, we have
$u_n(\theta_n(t)) \in C(\theta_n(t)) \cap \beta\overline{B}_E$. In view of (*), the upper semicontinuity of $C(\cdot)$ and
the ball-compactness assumption on the closed sets $C(t)$, we see that the nonempty
compact valued multifunction $C(\cdot) \cap \beta\overline{B}_E$ is upper semicontinuous on $[0, 1]$.
Thus, by (2.8.5) we conclude that the sequence $(u_n(\theta_n(t)))$ is relatively compact
in $E$. So, by Ascoli's theorem, the sequence $(u_n)$ is relatively compact in
$C_E([0, 1])$. It is obvious, by using (2.8.8), that the sequence $(\dot u_n)$ is relatively
weakly compact in $L^2_E([0, 1])$. Hence we may suppose that $(\dot u_n)$ weakly converges
in $L^2_E([0, 1])$ to a function $v$ with $\|v(t)\| \le m$ for a.e. $t \in [0, 1]$, and
$(u_n)$ converges in $C_E([0, 1])$ to a function $u$ with

\[ u(t) = a + \int_0^t v(s)\, ds, \quad \forall t \in [0, 1]. \]

Claim: $v(t) \in -A(t)u(t)$ a.e. $t \in [0, 1]$, and conclusion.

Let us set $w_n(t) := J_{e_n} A(t)(u_n(\delta_n(t)))$ for all $n \ge 1$ and for all $t \in [0, 1]$.
Then we have

\[ u_n(\delta_n(t)) \in w_n(t) + e_n A(t)w_n(t). \tag{2.8.10} \]

In view of $(H_1)$ and (2.8.7) we have the estimate

\[ \begin{aligned} \|u_n(\theta_n(t)) - w_n(t)\| &\le e_n \|\rho(\theta_n(t)) - \rho(t)\| L(\|u_n(\delta_n(t))\|) \\ &\le e_n \|\rho(\theta_n(t)) - \rho(t)\| L(\beta). \end{aligned} \]

It follows that

\[ \lim_{n \to \infty} \|u_n(\theta_n(t)) - w_n(t)\| = 0, \quad \forall t \in [0, 1]. \tag{2.8.11} \]

Consequently we get

\[ \lim_{n \to \infty} w_n(t) = u(t), \quad \forall t \in [0, 1]. \tag{2.8.12} \]

Let $n \ge 1$ and let $t \in \, ]0, 1[$. Then $t \in \, ]t_k^n, t_{k+1}^n[$ for some $0 \le k < n$. So, by
taking (2.8.10) and the preceding estimate into account we get

\[ \begin{aligned} d\big(-\dot u_n(t), A(t)w_n(t)\big) &= d\Big(\frac{u_n(\delta_n(t)) - u_n(\theta_n(t))}{e_n}, A(t)w_n(t)\Big) \\ &\le \frac{1}{e_n} \|u_n(\theta_n(t)) - w_n(t)\| \\ &\le \|\rho(\theta_n(t)) - \rho(t)\| L(\beta). \end{aligned} \]

As $\|\rho(\theta_n(t)) - \rho(t)\| L(\beta) \to 0$ for all $t \in [0, 1]$, $\dot u_n(\cdot) \to v$ weakly in
$L^2_E([0, 1])$ and, from (2.8.7), (2.8.11), (2.8.12) and Lebesgue's theorem, $w_n \to u$
strongly in $L^2_E([0, 1])$, by using Lemma 2.3 we get

\[ v(t) \in -A(t)u(t) \quad \text{a.e. } t \in [0, 1]. \]

As the graph of the compact valued upper semicontinuous multifunction
$C(\cdot) \cap \beta\overline{B}_E$ is closed, from (2.8.9) and our construction we see that
$u(t) \in C(t) \cap \beta\overline{B}_E \cap D$. The proof is therefore complete. □

The following is a uniqueness result whose proof is borrowed from ([7], 
Proposition 2.2). 

Proposition 2.9. Suppose that $A(t) : E \to 2^E$, $t \in [0, 1]$, is an m-accretive
operator. Let $g : [0, 1] \times C_E([-r, 0]) \to E$ be such that:

(i) there exists $d > 0$ such that $\|g(t, u) - g(t, v)\| \le d\,\|u - v\|_{C_E([-r, 0])}$ for all
$t \in [0, 1]$ and for all $u, v \in C_E([-r, 0])$,

(ii) for every $u \in C_E([-r, 0])$, the function $g(\cdot, u)$ is integrable on [0, 1].

Let $\psi \in C_E([-r, 0])$ with $\psi(0) \in D$. If $u_1$ and $u_2$ are strong solutions for the
problem

$$\begin{cases} \dot u(t) \in -A(t)u(t) + g(t, \tau(t)u) & \text{a.e. } t \in [0, 1], \\ u(s) = \psi(s), & s \in [-r, 0], \end{cases}$$

then $\|u_1 - u_2\|_t = 0$, $\forall t \in [0, 1]$, where $\|\cdot\|_t$ denotes the norm of $C_E([-r, t])$.

Proof. Let $u_1, u_2$ be two strong solutions of the problem under consideration.
Then we have $g(t, \tau(t)u_1) - \dot u_1(t) \in A(t)u_1(t)$ and $g(t, \tau(t)u_2) - \dot u_2(t) \in A(t)u_2(t)$ a.e.
Since $A(t)$ is accretive, then

(2.9.1) $\langle u_1(t) - u_2(t),\; g(t, \tau(t)u_1) - \dot u_1(t) - g(t, \tau(t)u_2) + \dot u_2(t) \rangle_+ \ge 0$,

a.e., where $\langle \cdot, \cdot \rangle_+$ denotes the usual upper semi-inner product on $E$. It is well
known (see e.g. [31]) that (2.9.1) can be written as

(2.9.2) $\langle u_1(t) - u_2(t),\; g(t, \tau(t)u_1) - \dot u_1(t) - g(t, \tau(t)u_2) + \dot u_2(t) \rangle_+ = \langle j(u_1(t) - u_2(t)),\; g(t, \tau(t)u_1) - \dot u_1(t) - g(t, \tau(t)u_2) + \dot u_2(t) \rangle$,

where $j : E \to E'$ is the single-valued duality mapping. So (2.9.2) is equivalent
to

$$\langle j(u_1(t) - u_2(t)),\; \dot u_1(t) - \dot u_2(t) \rangle \le \langle j(u_1(t) - u_2(t)),\; g(t, \tau(t)u_1) - g(t, \tau(t)u_2) \rangle.$$

By ([31], Lemma 1.4.2(ix)) we have

$$\frac{1}{2}\frac{d}{dt}\|u_1(t) - u_2(t)\|^2 \le \langle j(u_1(t) - u_2(t)),\; g(t, \tau(t)u_1) - g(t, \tau(t)u_2) \rangle \le d\,\|u_1(t) - u_2(t)\|\,\|\tau(t)u_1 - \tau(t)u_2\|_{C_E([-r, 0])},$$

for a.e. $t \in [0, 1]$. Integrating on $[0, s]$ we obtain

$$\frac{1}{2}\|u_1(s) - u_2(s)\|^2 \le d \int_0^s \|\tau(\theta)u_1 - \tau(\theta)u_2\|_{C_E([-r, 0])}\,\|u_1(\theta) - u_2(\theta)\|\,d\theta, \quad 0 \le s \le t.$$

Since $u_1 = \psi = u_2$ in $[-r, 0]$, the last inequality implies that

$$\frac{1}{2}\|u_1 - u_2\|_t^2 \le d \int_0^t \|u_1 - u_2\|_\theta^2\,d\theta,$$

where $\|\cdot\|_\theta$ denotes the norm of $C_E([-r, \theta])$. Because $\theta \mapsto \|u_1 - u_2\|_\theta$ is
continuous, by applying Gronwall's lemma we conclude that

$$\|u_1 - u_2\|_t = 0, \quad \forall t \in [0, 1].$$

Hence $u_1 = u_2$ in $[-r, 1]$. □
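
The Gronwall step can be spelled out as follows. This is only a sketch: the symbol $\varphi$ is introduced here for illustration, and the integral inequality is assumed to read $\frac{1}{2}\|u_1 - u_2\|_t^2 \le d\int_0^t \|u_1 - u_2\|_\theta^2\,d\theta$ as above.

```latex
\[
\varphi(t) := \|u_1 - u_2\|_t^2 \ \ (\text{continuous, nonnegative}), \qquad
\varphi(t) \le 0 + 2d \int_0^t \varphi(\theta)\, d\theta \quad (t \in [0,1]).
\]
% Gronwall's lemma: if \varphi(t) \le c + k \int_0^t \varphi(\theta) d\theta
% with constants c \ge 0, k \ge 0, then \varphi(t) \le c\, e^{kt}.
\[
c = 0,\ k = 2d \quad \Longrightarrow \quad
0 \le \varphi(t) \le 0 \cdot e^{2dt} = 0 \quad \text{for all } t \in [0,1].
\]
```

Since the additive constant is zero, the exponential factor plays no role, which is why uniqueness holds on the whole interval and not only for small times.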

The following deals with compactness of solutions set for problems $(P_G)$.
For any $cc(E)$-valued measurable multifunction $\Gamma : [0, 1] \to cc(E)$, let us
denote by $S^1_\Gamma := \{f \in L^1_E([0, 1]) : f(t) \in \Gamma(t) \text{ a.e.}\}$.

Proposition 2.10. Suppose that $A(t) : E \to cc(E) \cup \{\emptyset\}$, $t \in [0, 1]$, is an
m-accretive operator, $(H_1)$, $(H_2)$, $(H_3)$ are satisfied, and $D$ is ball-compact.
Then the following hold:

a) For any $x_0 \in D$ and for any convex weakly compact subset $\Gamma$, the strong
solutions set $\{u_f : f \in S^1_\Gamma\}$ of the problem

$$(P_f) \quad \begin{cases} \dot u(t) \in -A(t)u(t) + f(t) & \text{a.e. } t \in [0, 1], \\ u(0) = x_0, \ u(t) \in D, & \forall t \in [0, 1], \end{cases}$$

is a nonempty compact subset of $C_E([0, 1])$.

b) For any $x_0 \in D$ and for any convex weakly compact subset $\Gamma$, for any convex
weakly compact valued multifunction $G : [0, 1] \times E \to \Gamma$, separately scalarly
measurable and separately scalarly upper semicontinuous, the strong solutions
set of

$$(P_G) \quad \begin{cases} \dot u(t) \in -A(t)u(t) + G(t, u(t)) & \text{a.e. } t \in [0, 1], \\ u(0) = x_0, \ u(t) \in D, & \forall t \in [0, 1], \end{cases}$$

is a nonempty compact subset of $C_E([0, 1])$.

Proof. a) An easy inspection of Step 1 of the proof of Theorem 2.4 shows that
the strong solutions set $\{u_f : f \in S^1_\Gamma\}$ of

$$(P_f) \quad \begin{cases} \dot u_f(t) \in -A(t)u_f(t) + f(t) & \text{a.e. } t \in [0, 1], \\ u_f(0) = x_0, \ u_f(t) \in D, & \forall t \in [0, 1], \end{cases}$$

is not empty. Moreover, we have $u_f(t) = x_0 + \int_0^t \dot u_f(s)\,ds$, $\forall f \in S^1_\Gamma$, $\forall t \in [0, 1]$,
with $\|\dot u_f(t)\| \le M$ for all $f \in S^1_\Gamma$ and for all $t \in [0, 1]$,
where $M$ is a positive constant, using the estimate (2.4.9) for this particular
case. By Ascoli's theorem, $\{u_f : f \in S^1_\Gamma\}$ is relatively compact in $C_E([0, 1])$.
Let $(f_n)$ be a sequence in $S^1_\Gamma$. There is a subsequence $(g_m)$ of $(f_n)$ such
that $(g_m)$ converges $\sigma(L^1_E([0, 1]), L^\infty_{E'}([0, 1]))$ to $h \in S^1_\Gamma$. We may also
suppose that $(\dot u_{g_m})$ converges $\sigma(L^1_E([0, 1]), L^\infty_{E'}([0, 1]))$ to a measurable
function $v$ with $\|v(t)\| \le M$ a.e. and $(u_{g_m})$ converges to $u \in C_E([0, 1])$ with
$u(t) = x_0 + \int_0^t \dot u(s)\,ds$ for all $t \in [0, 1]$, and $\dot u = v$ a.e. By Lemma 2.3 we see
that $\dot u(t) \in -A(t)u(t) + h(t)$ for a.e. $t \in [0, 1]$.



b) It is easy to see that the strong solutions set to problem $(P_G)$ is relatively
compact in $C_E([0, 1])$, using a). Let $(u_n)$ be a sequence of strong solutions
to $(P_G)$ converging uniformly to a function $u_\infty \in C_E([0, 1])$. We have that
$u_\infty(0) = x_0$. There is $g_n \in S^1_\Gamma$ satisfying (*) $\dot u_n(t) \in -A(t)u_n(t) + g_n(t)$
and (**) $g_n(t) \in G(t, u_n(t))$ for almost every $t \in [0, 1]$. By extracting
subsequences, we may suppose that $(g_n)$ converges $\sigma(L^1_E, L^\infty_{E'})$ to $g_\infty \in S^1_\Gamma$
and $\dot u_n$ converges $\sigma(L^1_E, L^\infty_{E'})$ to $\dot u_\infty$. By (*) and Lemma 2.3, we have that
$\dot u_\infty(t) \in -A(t)u_\infty(t) + g_\infty(t)$ for almost every $t \in [0, 1]$. Using (**) and
applying the closure theorem in ([9], Theorem VI-4) gives $g_\infty(t) \in G(t, u_\infty(t))$
for almost every $t \in [0, 1]$. This proves that $u_\infty$ is a strong solution of $(P_G)$.

□



3. Applications 

We present some typical problems in Optimal Control theory. Let $Z$ be a
compact metric space, let $k(Z)$ be the set of all compact subsets of $Z$, and let
$\mathcal{M}^1_+(Z)$ be the set of all probability Radon measures on $Z$. It is well known
that $\mathcal{M}^1_+(Z)$ is a compact metrizable space for the $\sigma(C(Z)', C(Z))$-topology.
We need first a density result.

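Proposition 3.1. (Castaing-Valadier 1971, unpublished.) Let $(T, \mathcal{T}, \mu)$ be a
complete probability space and let $Z$ be a compact metric space. Let $\Gamma : T \to \mathcal{B}(Z)$
be a multifunction with measurable graph and let

$$\Sigma(t) := \{\nu \in \mathcal{M}^1_+(Z) : \nu(\Gamma(t)) = 1\}$$

for all $t \in T$. Let $\lambda \in S_\Sigma$ (where $S_\Sigma$ is the set of all $\mathcal{T}$-measurable selections
of $\Sigma$) and $\sigma_\lambda = \int_T \delta_t \otimes \lambda_t\,\mu(dt)$. Let $\varepsilon > 0$, $(g_i : i \in I)$ and $(h_i : i \in I)$
two finite sequences in $L^1_{\mathbb{R}}(T, \mu)$ and $C(Z)$ respectively. If $\mu$ is nonatomic, then
there exists a $\mathcal{T}$-measurable selection $\zeta$ of $\Gamma$, such that

$$|\langle \sigma_\lambda, g_i \otimes h_i \rangle - \langle \sigma_\zeta, g_i \otimes h_i \rangle| < \varepsilon, \quad \forall i \in I,$$

where $\sigma_\zeta = \int_T \delta_t \otimes \delta_{\zeta(t)}\,\mu(dt)$.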

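Proof. Let us set $M := \sup_{i \in I} \int_T |g_i(t)|\,\mu(dt)$. There exists a finite Borel
partition $(\Omega_j, j \in J)$ of $Z$ such that for any pair $z, z'$ in $\Omega_j$, the following
holds: $|h_i(z) - h_i(z')| < \varepsilon / M$, $\forall i \in I$. Let $\zeta_j : T \to Z$ be a
$\mathcal{T}$-measurable mapping defined by: $\zeta_j(t) \in \Gamma(t) \cap \Omega_j$ if $\Gamma(t) \cap \Omega_j \ne \emptyset$, and
$\zeta_j(t) \in \Gamma(t)$ otherwise; whose existence is ensured by the Von Neumann-Aumann
measurable selection theorem. Let us set $\nu_t := \sum_{j \in J} \lambda_t(\Omega_j)\,\delta_{\zeta_j(t)}$, $\forall t \in T$.
Then it is obvious that $t \mapsto \nu_t$ is scalarly $\mathcal{T}$-measurable (alias Young measure).
Then

$$\langle \sigma_\nu, g_i \otimes h_i \rangle = \sum_{j \in J} \int_T \lambda_t(\Omega_j)\,g_i(t)\,h_i(\zeta_j(t))\,\mu(dt).$$

Further we have $\langle \sigma_\lambda - \sigma_\nu, g_i \otimes h_i \rangle = \int_T g_i(t)\,\langle \lambda_t - \nu_t, h_i \rangle\,\mu(dt)$ with

$$\langle \lambda_t - \nu_t, h_i \rangle := \int_{\Gamma(t)} \Big[h_i(z) - \sum_{j \in J} 1_{\Omega_j}(z)\,h_i(\zeta_j(t))\Big]\,\lambda_t(dz).$$

As the integrand in the preceding integral is estimated by $\varepsilon / M$, we get
$|\langle \sigma_\lambda - \sigma_\nu, g_i \otimes h_i \rangle| < \varepsilon$ for all $i \in I$. As
$\mu$ is nonatomic, by Ljapunov's theorem, there exists a $\mathcal{T}$-measurable partition
$(T_j, j \in J)$ of $T$ such that

$$\int_T \lambda_t(\Omega_j)\,g_i(t)\,h_i(\zeta_j(t))\,\mu(dt) = \int_{T_j} g_i(t)\,h_i(\zeta_j(t))\,\mu(dt),$$

for all $i \in I$ and $j \in J$. Let us set $\zeta(t) = \zeta_j(t)$ if $t \in T_j$, $j \in J$, and
$\sigma_\zeta = \int_T \delta_t \otimes \delta_{\zeta(t)}\,\mu(dt)$. Then we have

$$\langle \sigma_\nu, g_i \otimes h_i \rangle = \langle \sigma_\zeta, g_i \otimes h_i \rangle,$$

for all $i \in I$. It follows that

$$|\langle \sigma_\lambda, g_i \otimes h_i \rangle - \langle \sigma_\zeta, g_i \otimes h_i \rangle| < \varepsilon,$$

for all $i \in I$. □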
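
The density statement can be illustrated numerically on a toy case. The following is a minimal sketch, not part of the proof: it assumes $T = [0, 1]$ with Lebesgue measure, $Z = \Gamma(t) = \{0, 1\}$, the constant relaxed control $\lambda_t = \frac{1}{2}(\delta_0 + \delta_1)$, and a single hypothetical pair of test functions $g(t) = t$, $h(z) = z$. The rapidly switching ordinary control `chattering_control(n)` is an illustrative choice of selection, not the one built in the proof.

```python
def chattering_control(n):
    """Ordinary control on [0, 1] that alternates between 0 and 1 on 2n equal intervals."""
    def zeta(t):
        return int(2 * n * t) % 2
    return zeta


def integral_against_control(g, h, zeta, n):
    """Integral of g(t) * h(zeta(t)) over [0, 1] by the midpoint rule.

    One node per switching interval; the rule is exact here because zeta is
    constant on each interval and the test function g below is affine.
    """
    width = 1.0 / (2 * n)
    total = 0.0
    for k in range(2 * n):
        mid = (k + 0.5) * width
        total += g(mid) * h(zeta(mid)) * width
    return total


def g(t):
    return t              # hypothetical integrable test function


def h(z):
    return float(z)       # hypothetical continuous function on Z = {0, 1}


# Pairing with the relaxed control (delta_0 + delta_1) / 2:
# integral over [0, 1] of t * (h(0) + h(1)) / 2 dt = 1/4.
relaxed_value = 0.25


def gap(n):
    """Distance between the ordinary and the relaxed pairing."""
    value = integral_against_control(g, h, chattering_control(n), n)
    return abs(value - relaxed_value)


for n in (1, 10, 100, 1000):
    print(n, gap(n))
```

For this particular pair the gap can be computed by hand as $1/(8n)$, so it falls below any prescribed $\varepsilon$ once the control switches often enough.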

In view of the preceding result, the set $S_\Gamma$ of all Lebesgue-measurable
selections (alias original controls) of $\Gamma$ is dense for the $\sigma(L^\infty_{C(Z)'}([0, 1]), L^1_{C(Z)}([0, 1]))$
topology in the set $S_\Sigma$ of Lebesgue-measurable selections (alias relaxed
controls) of the Lebesgue-measurable multifunction $\Sigma : [0, 1] \to \mathcal{M}^1_+(Z)$ defined
by

$$\Sigma(t) := \{\nu \in \mathcal{M}^1_+(Z) : \nu(\Gamma(t)) = 1\}$$

for all $t \in [0, 1]$.

Let us consider a mapping $g : [0, 1] \times E \times Z \to E$ satisfying:

(i) for every fixed $t \in [0, 1]$, $g(t, \cdot, \cdot)$ is continuous on $E \times Z$,

(ii) for every $(x, z) \in E \times Z$, $g(\cdot, x, z)$ is Lebesgue-measurable on [0, 1],

(iii) for every $\eta > 0$, there exists $l(\eta) > 0$ such that $\|g(t, x, z) - g(t, y, z)\| \le l(\eta)\|x - y\|$
for all $t \in [0, 1]$ and for all $(x, y, z) \in \overline{B}_E(0, \eta) \times \overline{B}_E(0, \eta) \times Z$,
where $\overline{B}_E(0, \eta)$ denotes the closed ball in $E$ with center 0 and radius $\eta$,

(iv) there exists $c > 0$ such that $g(t, x, Z) \subset c\overline{B}_E$ for all $(t, x) \in [0, 1] \times E$.

Let us set

$$h(t, x, \nu) := \int_Z g(t, x, z)\,\nu(dz),$$

for all $(t, x, \nu) \in [0, 1] \times E \times \mathcal{M}^1_+(Z)$. Then $h$ inherits the properties of the
function $g$. Namely, we have

(i)' for every fixed $t \in [0, 1]$, $h(t, \cdot, \cdot)$ is continuous on $E \times \mathcal{M}^1_+(Z)$,

(ii)' for every $(x, \nu) \in E \times \mathcal{M}^1_+(Z)$, $h(\cdot, x, \nu)$ is Lebesgue-measurable on [0, 1],

(iii)' for every $\eta > 0$, there exists $l(\eta) > 0$ such that $\|h(t, x, \nu) - h(t, y, \nu)\| \le l(\eta)\|x - y\|$
for all $t \in [0, 1]$ and for all $(x, y, \nu) \in \overline{B}_E(0, \eta) \times \overline{B}_E(0, \eta) \times \mathcal{M}^1_+(Z)$,
where $\overline{B}_E(0, \eta)$ denotes the closed ball in $E$ with center 0 and radius $\eta$,

(iv)' there exists $c > 0$ such that $h(t, x, \mathcal{M}^1_+(Z)) \subset c\overline{B}_E$ for all $(t, x) \in [0, 1] \times E$.

In the remainder we will assume that $\Gamma : [0, 1] \to Z$ is a compact
valued measurable multifunction. Now let $A(t)$ be an m-accretive operator in
$E$ satisfying $(H_1)$, $(H_2)$, $(H_3)$ with $D := D(A(t))$ ball-compact. We aim to
compare the strong solutions set $(S_O) := \{u_\zeta : \zeta \in S_\Gamma\}$ of the evolution
equation

$$(P_O) \quad \begin{cases} \dot u_\zeta(t) \in -A(t)u_\zeta(t) + g(t, u_\zeta(t), \zeta(t)) & \text{a.e. } t \in [0, 1], \\ u_\zeta(0) = x_0 \in D, \end{cases}$$

where $\zeta$ belongs to the set $S_\Gamma$ of all original controls, with the strong solutions
set $(S_R) := \{u_\lambda : \lambda \in S_\Sigma\}$ of the evolution equation

$$(P_R) \quad \begin{cases} \dot u_\lambda(t) \in -A(t)u_\lambda(t) + \int_{\Gamma(t)} g(t, u_\lambda(t), z)\,\lambda_t(dz) & \text{a.e. } t \in [0, 1], \\ u_\lambda(0) = x_0 \in D, \end{cases}$$

where $\lambda$ belongs to the set $S_\Sigma$ of all relaxed controls. In view of Propositions
2.9-2.10 b), for each $\lambda \in S_\Sigma$, there is a unique solution $u_\lambda$ to the problem

$$(P_R) \quad \begin{cases} \dot u_\lambda(t) \in -A(t)u_\lambda(t) + \int_{\Gamma(t)} g(t, u_\lambda(t), z)\,\lambda_t(dz) & \text{a.e. } t \in [0, 1], \\ u_\lambda(0) = x_0 \in D. \end{cases}$$

Let us set $H(t, x) := h(t, x, \Sigma(t))$, $\forall (t, x) \in [0, 1] \times E$. Then it is easy to check
that $H$ is a separately scalarly measurable and separately scalarly upper
semicontinuous convex compact valued multifunction satisfying $H(t, x) \subset c\overline{B}_E$
for all $(t, x) \in [0, 1] \times E$. By Proposition 2.10 b) the strong solutions set of

$$\begin{cases} \dot u(t) \in -A(t)u(t) + H(t, u(t)) & \text{a.e. } t \in [0, 1], \\ u(0) = x_0 \in D, \end{cases}$$

is compact in $C_E([0, 1])$. Using a measurable selection theorem, we see that $(S_R)$
coincides with the strong solutions set of the preceding differential inclusion.
So we conclude that the solutions set $\{u_\lambda : \lambda \in S_\Sigma\}$ of $(P_R)$ is compact in
$C_E([0, 1])$. It is obvious that the solutions set $\{u_\zeta : \zeta \in S_\Gamma\}$ is a nonempty
subset of $\{u_\lambda : \lambda \in S_\Sigma\}$. Now comes a relaxation property for the problem
under consideration.

Theorem 3.2. Assume that $E$ is a separable Hilbert space, $A(t)$ is a maximal
monotone operator in $E$, and $(H_1)$, $(H_2)$, $(H_3)$ are satisfied with $D := D(A(t))$
ball-compact. Then the solution set $(S_O) = \{u_\zeta : \zeta \in S_\Gamma\}$ is dense
with respect to the uniform norm in the compact solutions set $(S_R) = \{u_\lambda : \lambda \in S_\Sigma\}$.

It is clear that Theorem 3.2 follows from

Lemma 3.3. The graph of the single-valued mapping $\lambda \mapsto u_\lambda$ defined on the
$\sigma(L^\infty_{C(Z)'}([0, 1]), L^1_{C(Z)}([0, 1]))$ compact set $S_\Sigma$ is closed.

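Proof. Let $\lambda^n \to \lambda^\infty$ for the $\sigma(L^\infty_{C(Z)'}([0, 1]), L^1_{C(Z)}([0, 1]))$ topology and
$u_{\lambda^n} \to u_\infty$ in $C_H([0, 1])$, where $u_{\lambda^n}$ ($n \in \mathbb{N} \cup \{\infty\}$) is the unique absolutely
continuous solution of the equation

$$\begin{cases} \dot u_{\lambda^n}(t) \in -A(t)u_{\lambda^n}(t) + \int_{\Gamma(t)} g(t, u_{\lambda^n}(t), z)\,\lambda^n_t(dz) & \text{a.e. } t \in [0, 1], \\ u_{\lambda^n}(0) = x_0 \in D; \end{cases}$$

we show that then $u_\infty = u_{\lambda^\infty}$. For simplicity we set

$$h_\lambda(t, x) = \int_{\Gamma(t)} g(t, x, z)\,\lambda_t(dz) = \int_Z g(t, x, z)\,\lambda_t(dz) = h(t, x, \lambda_t)$$

for all $(t, x, \lambda) \in [0, 1] \times H \times S_\Sigma$. We have

$$h_{\lambda^n}(t, u_{\lambda^n}(t)) - \dot u_{\lambda^n}(t) \in A(t)u_{\lambda^n}(t),$$

and

$$h_{\lambda^\infty}(t, u_{\lambda^\infty}(t)) - \dot u_{\lambda^\infty}(t) \in A(t)u_{\lambda^\infty}(t),$$

for a.e. $t \in [0, 1]$. Since $A(t)$ is monotone,

$$\langle u_{\lambda^n}(t) - u_{\lambda^\infty}(t),\; h_{\lambda^n}(t, u_{\lambda^n}(t)) - \dot u_{\lambda^n}(t) - h_{\lambda^\infty}(t, u_{\lambda^\infty}(t)) + \dot u_{\lambda^\infty}(t) \rangle \ge 0,$$

a.e. in [0, 1]. So

$$\frac{1}{2}\frac{d}{dt}\|u_{\lambda^n}(t) - u_{\lambda^\infty}(t)\|^2 \le \langle u_{\lambda^n}(t) - u_{\lambda^\infty}(t),\; h_{\lambda^n}(t, u_{\lambda^n}(t)) - h_{\lambda^\infty}(t, u_{\lambda^\infty}(t)) \rangle,$$

a.e. in [0, 1]. Integrating on $[0, t]$ gives

$$\frac{1}{2}\|u_{\lambda^n}(t) - u_{\lambda^\infty}(t)\|^2 \le \int_0^t \langle u_{\lambda^n}(s) - u_{\lambda^\infty}(s),\; h_{\lambda^n}(s, u_{\lambda^n}(s)) - h_{\lambda^\infty}(s, u_{\lambda^\infty}(s)) \rangle\,ds.$$

Let us set

$$L_n(t) = \int_0^t \langle u_{\lambda^n}(s) - u_{\lambda^\infty}(s),\; h_{\lambda^n}(s, u_{\lambda^n}(s)) - h_{\lambda^\infty}(s, u_{\lambda^\infty}(s)) \rangle\,ds.$$

Then $L_n(t) = L^1_n(t) + L^2_n(t) + L^3_n(t)$ where

$$L^1_n(t) = \int_0^t \langle u_{\lambda^n}(s) - u_{\lambda^\infty}(s),\; h_{\lambda^n}(s, u_{\lambda^n}(s)) - h_{\lambda^n}(s, u_{\lambda^\infty}(s)) \rangle\,ds,$$

$$L^2_n(t) = \int_0^t \langle u_{\lambda^n}(s) - u_\infty(s),\; h_{\lambda^n}(s, u_{\lambda^\infty}(s)) - h_{\lambda^\infty}(s, u_{\lambda^\infty}(s)) \rangle\,ds,$$

$$L^3_n(t) = \int_0^t \langle u_\infty(s) - u_{\lambda^\infty}(s),\; h_{\lambda^n}(s, u_{\lambda^\infty}(s)) - h_{\lambda^\infty}(s, u_{\lambda^\infty}(s)) \rangle\,ds.$$

As $\|h_\lambda(t, x)\| \le c$ for all $\lambda \in S_\Sigma$ and for all $(t, x) \in [0, 1] \times H$, using (iv)',
we get the estimation

$$|L^2_n(t)| \le 2c\,\|u_{\lambda^n} - u_\infty\|_{C_H([0, 1])}.$$

Thus $L^2_n \to 0$ uniformly in [0, 1], when $n \to \infty$. Similarly by (iv) the
integrand

$$f(s, z) := \langle u_\infty(s) - u_{\lambda^\infty}(s),\; g(s, u_{\lambda^\infty}(s), z) \rangle$$

is estimated by

$$|f(s, z)| \le c\,\|u_\infty - u_{\lambda^\infty}\|_{C_H([0, 1])},$$

for all $(s, z) \in [0, 1] \times Z$. Hence $f \in L^1_{C(Z)}([0, 1])$. As $(\lambda^n)$ converges
$\sigma(L^\infty_{C(Z)'}, L^1_{C(Z)})$ to $\lambda^\infty$, it is immediate that for every $t \in [0, 1]$,

$$\int_0^t \Big[\int_Z f(s, z)\,\lambda^n_s(dz)\Big]\,ds \to \int_0^t \Big[\int_Z f(s, z)\,\lambda^\infty_s(dz)\Big]\,ds$$

when $n \to \infty$. So $\lim_{n \to \infty} L^3_n(t) = 0$. Further there is $\eta > 0$ such that

$$\sup\{\|u_\lambda\|_{C_H([0, 1])} : \lambda \in S_\Sigma\} \le \eta < \infty.$$

Using (iii)', there is $l(\eta) > 0$ such that

$$L^1_n(t) \le \int_0^t l(\eta)\,\|u_{\lambda^n}(s) - u_{\lambda^\infty}(s)\|^2\,ds.$$

Finally we get

$$\frac{1}{2}\|u_{\lambda^n}(t) - u_{\lambda^\infty}(t)\|^2 \le L^2_n(t) + L^3_n(t) + \int_0^t l(\eta)\,\|u_{\lambda^n}(s) - u_{\lambda^\infty}(s)\|^2\,ds.$$

As the functions $L^2_n(\cdot)$ and $L^3_n(\cdot)$ are continuous with $L^2_n(t) \to 0$ and $L^3_n(t) \to 0$,
by Gronwall's lemma we conclude that

$$u_{\lambda^n}(t) \to u_{\lambda^\infty}(t)$$

for all $t \in [0, 1]$. So, $u_{\lambda^\infty} = u_\infty$ and the set $\{u_\lambda : \lambda \in S_\Sigma\}$ is compact in
$C_H([0, 1])$. □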

There is a variant of Theorem 3.2.

Example 3.4. Let $E$ be a separable Hilbert space, $U$ be a convex weakly compact
subset of $E$ and let $\Lambda : [0, 1] \to U$ be a convex weakly compact valued
upper semicontinuous multifunction. In view of ([9], Theorem IV.16) the graph
of the multifunction $\mathrm{ext}(\Lambda) : t \mapsto \mathrm{ext}(\Lambda(t))$, where $\mathrm{ext}(\Lambda(t))$ denotes the set of
all extreme points of $\Lambda(t)$, is a Borel subset in $[0, 1] \times U$. So, one can apply
Proposition 3.1 by taking $(T, \mathcal{T}, \mu) = ([0, 1], \mathcal{L}([0, 1]), \lambda)$, $Z = U_{weak}$,
$\Gamma = \mathrm{ext}(\Lambda)$ and $\Sigma = \Psi$ where

$$\Psi(t) := \{\nu \in \mathcal{M}^1_+(U) : \nu(\mathrm{ext}(\Lambda(t))) = 1\},$$

for all $t \in [0, 1]$. Repeating the above arguments, we conclude that the strong
solutions set $(S_{\mathrm{ext}(\Lambda)})$ of the equation

$$(P_{\mathrm{ext}(\Lambda)}) \quad \begin{cases} \dot u_\zeta(t) \in -A(t)u_\zeta(t) + g(t, u_\zeta(t), \zeta(t)) & \text{a.e. } t \in [0, 1], \\ u_\zeta(0) = x_0 \in D, \end{cases}$$

where $\zeta$ belongs to the set of all Lebesgue-measurable selections of the
multifunction $\mathrm{ext}(\Lambda(\cdot))$, is dense for the uniform convergence in the strong solutions
set $(S_\Psi)$ of

$$(P_\Psi) \quad \begin{cases} \dot u_\lambda(t) \in -A(t)u_\lambda(t) + \int_{\mathrm{ext}(\Lambda(t))} g(t, u_\lambda(t), u)\,\lambda_t(du) & \text{a.e. } t \in [0, 1], \\ u_\lambda(0) = x_0 \in D, \end{cases}$$

where $\lambda$ is a Lebesgue-measurable selection of the multifunction $\Psi$ defined
above.

In the same vein, we have the following which concerns an analogous
relaxation property for a quasi-autonomous evolution equation governed by an
m-accretive operator.

Example 3.5. Let $A(t)$ be a $cc(E) \cup \{\emptyset\}$-valued m-accretive operator satisfying $(H_1)$,
$(H_2)$, $(H_3)$ with $D := D(A(t))$ ball-compact, $K$ be a convex weakly compact
subset of $E$ and let $F : [0, 1] \to K$ be a convex weakly compact valued upper
semicontinuous multifunction. As above the graph of the multifunction $\mathrm{ext}(F) : t \mapsto \mathrm{ext}(F(t))$,
where $\mathrm{ext}(F(t))$ denotes the set of all extreme points of $F(t)$,
is a Borel subset in $[0, 1] \times K$. We claim that the strong solutions set $(S_{\mathrm{ext}(F)})$
of the equation

$$(P_{\mathrm{ext}(F)}) \quad \begin{cases} \dot u_f(t) \in -A(t)u_f(t) + f(t) & \text{a.e. } t \in [0, 1], \\ u_f(0) = x_0 \in D, \end{cases}$$

where $f$ belongs to the set $S^\infty_{\mathrm{ext}(F)}$ of all bounded Lebesgue-measurable
selections of the multifunction $\mathrm{ext}(F)$, is dense for the topology of uniform
convergence in the compact solutions set $(S_F)$ of the equation

$$(P_F) \quad \begin{cases} \dot u_f(t) \in -A(t)u_f(t) + f(t) & \text{a.e. } t \in [0, 1], \\ u_f(0) = x_0 \in D, \end{cases}$$

where $f$ belongs to the set $S^\infty_F$ of all bounded Lebesgue-measurable selections of
the multifunction $F$. First, observe that, in view of Corollary 2.6 and
Proposition 2.9, for each $f \in S^\infty_F$, there exists a unique strong solution $u_f$ for the
equation

$$\begin{cases} \dot u_f(t) \in -A(t)u_f(t) + f(t) & \text{a.e. } t \in [0, 1], \\ u_f(0) = x_0 \in D. \end{cases}$$

Second, it is well known (see e.g. [9]) that $S^\infty_{\mathrm{ext}(F)}$ is dense in $S^\infty_F$ for the
$\sigma(L^\infty_E, L^1_{E'})$-topology, using Ljapunov's theorem. By Proposition 2.10 a), $(S_F)$
is relatively compact in $C_E([0, 1])$. So, it is enough to remark that the graph
of the mapping $f \mapsto u_f$ from $S^\infty_F$ equipped with the $\sigma(L^\infty_E, L^1_{E'})$
topology into $C_E([0, 1])$ equipped with the topology of uniform convergence
is closed, repeating the arguments given in the proof of Proposition 2.10 b).
Details are left to the reader. □

Comments. If $E$ is a finite dimensional space, the ball-compactness assumption on
$D$ in Theorem 2.4 is superfluous. The study of undelayed differential inclusions
governed by m-accretive operators arises from Economics [10, 16], Mechanics
[20, 21] and Control Theory [8, 14, 17, 29, 30]. Other results in relaxed control
theory for both second order ordinary differential equations and second order
multivalued differential equations appear in a forthcoming paper by
Azzam-Castaing-Thibault.

To end this paper, we now present the existence of absolutely continuous
solutions for a special class of evolution equations in a separable Hilbert space
$H$. Here the compactness assumption is unnecessary, in contrast to the results
presented above.

Proposition 3.6. Let $H$ be a separable Hilbert space. Let $f : [0, 1] \times H \to \mathbb{R}$
satisfying:

(1) for every $x \in H$, $f(\cdot, x)$ is Lebesgue-measurable,

(2) there exists $\beta \in L^2_{\mathbb{R}}([0, 1])$ such that for all $t \in [0, 1]$ and for all $x, y$ in $H$

$$|f(t, x) - f(t, y)| \le \beta(t)\,\|x - y\|,$$

(3) for every $t \in [0, 1]$, the function $f(t, \cdot)$ is convex.

Then, for any $a \in H$ there is a unique absolutely continuous solution to the
equation

$$\begin{cases} \dot u(t) \in -\partial f_t(u(t)) & \text{a.e. } t \in [0, 1], \\ u(0) = a \in H. \end{cases}$$



Proof. Let $A(t) = \partial f_t$. According to (1), (2), (3) and the example just after
Lemma 2.3, the maximal monotone operator A{t) satisfies assumption (H^). 
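In particular, the mappings $t \mapsto J_\lambda A(t)x$ and $t \mapsto A_\lambda(t)x$ are measurable on
[0, 1] for every $\lambda > 0$ and every $x \in H$. Note that for each $\lambda > 0$ the equation

$$\begin{cases} \dfrac{du_\lambda}{dt}(t) + A_\lambda(t)u_\lambda(t) = 0, \\ u_\lambda(0) = a \in H, \end{cases}$$

admits a unique absolutely continuous solution because $A_\lambda(t)x$ is separately
measurable and separately Lipschitzean and satisfies the estimate $\|A_\lambda(t)x\| \le \beta(t)$
for all $\lambda > 0$ and for all $(t, x) \in [0, 1] \times H$.

Now let $(\lambda_n)$ be a decreasing sequence of positive real numbers such that
$\lambda_n \downarrow 0$. For each integer $n$ there exists an absolutely continuous solution
$u_{\lambda_n} : [0, 1] \to H$ to the equation:

$$\begin{cases} \dot u_{\lambda_n}(t) + A_{\lambda_n}(t)u_{\lambda_n}(t) = 0, \\ u_{\lambda_n}(0) = a. \end{cases}$$

Furthermore we have the estimate

$$\forall n \in \mathbb{N}, \quad \|\dot u_{\lambda_n}\|_{L^2_H([0, 1])} \le \|\beta\|_{L^2([0, 1])}.$$

Let $m$ and $n$ be any positive integers. We have

(3.6.1) $\dfrac{d}{dt}\|u_{\lambda_n}(t) - u_{\lambda_m}(t)\|^2 = 2\langle u_{\lambda_n}(t) - u_{\lambda_m}(t),\; \dot u_{\lambda_n}(t) - \dot u_{\lambda_m}(t) \rangle$

$= 2\langle \lambda_n A_{\lambda_n}(t)u_{\lambda_n}(t) - \lambda_m A_{\lambda_m}(t)u_{\lambda_m}(t),\; \dot u_{\lambda_n}(t) - \dot u_{\lambda_m}(t) \rangle$

$\quad + 2\langle J_{\lambda_n}(t)u_{\lambda_n}(t) - J_{\lambda_m}(t)u_{\lambda_m}(t),\; \dot u_{\lambda_n}(t) - \dot u_{\lambda_m}(t) \rangle.$

Here $J_\lambda(t)$ stands for $J_\lambda A(t)$. Notice that

(3.6.2) $-\dot u_{\lambda_m}(t) = A_{\lambda_m}(t)u_{\lambda_m}(t) \in A(t)J_{\lambda_m}(t)u_{\lambda_m}(t)$

and

(3.6.3) $-\dot u_{\lambda_n}(t) = A_{\lambda_n}(t)u_{\lambda_n}(t) \in A(t)J_{\lambda_n}(t)u_{\lambda_n}(t).$

From (3.6.2), (3.6.3) and the monotonicity of $A(t)$, it follows that

(3.6.4) $\langle J_{\lambda_n}(t)u_{\lambda_n}(t) - J_{\lambda_m}(t)u_{\lambda_m}(t),\; \dot u_{\lambda_n}(t) - \dot u_{\lambda_m}(t) \rangle \le 0.$

Using (3.6.4) and coming back to (3.6.1) we get

(3.6.5) $\dfrac{d}{dt}\|u_{\lambda_n}(t) - u_{\lambda_m}(t)\|^2 \le 2\langle \lambda_n A_{\lambda_n}(t)u_{\lambda_n}(t) - \lambda_m A_{\lambda_m}(t)u_{\lambda_m}(t),\; \dot u_{\lambda_n}(t) - \dot u_{\lambda_m}(t) \rangle$

$= -2\langle \lambda_n \dot u_{\lambda_n}(t) - \lambda_m \dot u_{\lambda_m}(t),\; \dot u_{\lambda_n}(t) - \dot u_{\lambda_m}(t) \rangle.$

Integrating inequality (3.6.5) on $[0, t]$ gives

(3.6.6) $\|u_{\lambda_n}(t) - u_{\lambda_m}(t)\|^2 \le -2\langle \lambda_n \dot u_{\lambda_n}(\cdot) - \lambda_m \dot u_{\lambda_m}(\cdot),\; \dot u_{\lambda_n}(\cdot) - \dot u_{\lambda_m}(\cdot) \rangle_{(L^2_H([0, t]), L^2_H([0, t]))}.$

Consequently we get

(3.6.7) $\langle \lambda_n \dot u_{\lambda_n}(\cdot) - \lambda_m \dot u_{\lambda_m}(\cdot),\; \dot u_{\lambda_n}(\cdot) - \dot u_{\lambda_m}(\cdot) \rangle_{(L^2_H([0, 1]), L^2_H([0, 1]))} \le 0.$

Since the sequence $(\dot u_{\lambda_n}(\cdot))$ is bounded in $L^2_H([0, 1])$, by (3.6.7) and [11],
$(\dot u_{\lambda_n}(\cdot))$ converges in the Hilbert space $L^2_H([0, 1])$. Let $v$ be the limit of
$(\dot u_{\lambda_n}(\cdot))$ and let, for all $t \in [0, 1]$,

$$u(t) = a + \int_0^t v(s)\,ds.$$

Then $u = \lim_{n \to \infty} u_{\lambda_n}(\cdot)$ in $C_H([0, 1])$ because the mapping $f \mapsto \int_0^{\cdot} f(s)\,ds$
from $L^2_H([0, 1])$ to $C_H([0, 1])$ is continuous for the norm topologies. Now we
finish the proof by showing that $u$ satisfies

$$\begin{cases} \dot u(t) \in -A(t)u(t) & \text{a.e. } t \in [0, 1], \\ u(0) = a \in H. \end{cases}$$

By (3.6.3) we have

(3.6.8) $-\dot u_{\lambda_n}(t) = A_{\lambda_n}(t)u_{\lambda_n}(t) \in A(t)J_{\lambda_n}(t)u_{\lambda_n}(t).$

But

(3.6.9) $\|J_{\lambda_n}(t)u_{\lambda_n}(t) - u(t)\| \le \|J_{\lambda_n}(t)u_{\lambda_n}(t) - u_{\lambda_n}(t)\| + \|u_{\lambda_n}(t) - u(t)\|,$

with

(3.6.10) $\|J_{\lambda_n}(t)u_{\lambda_n}(t) - u_{\lambda_n}(t)\| = \lambda_n\|A_{\lambda_n}(t)u_{\lambda_n}(t)\| = \lambda_n\|\dot u_{\lambda_n}(t)\| \le \lambda_n\beta(t).$

As $\lambda_n\beta(t) \to 0$, from (3.6.9) and (3.6.10) we see that

(3.6.11) $\|J_{\lambda_n}(t)u_{\lambda_n}(t) - u(t)\| \to 0.$

By extracting a subsequence, we may suppose that $(\dot u_{\lambda_n}(\cdot))$ converges a.e.
to $v$, for $H$ endowed with the norm topology. As the graph of $A(t) = \partial f_t$
is sequentially strongly $\times$ weakly closed, from (3.6.8) and (3.6.11), we finally
get $-v(t) \in A(t)u(t)$ a.e. Since $\dot u = v$ a.e., the proof is therefore complete
because the uniqueness is straightforward, using the monotonicity of $A(t)$ (see
also Proposition 2.9). □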
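
The regularization scheme used in this proof can be tried out on a one-dimensional toy case. The following is a minimal sketch under stated assumptions: $H = \mathbb{R}$, $f(t, x) = |x|$ (so that $\beta \equiv 1$) and $a = 1$ are hypothetical choices, and the explicit Euler discretization is an illustrative device that does not appear in the text. For this $f$ the Yosida approximation is $A_\lambda(x) = x/\lambda$ for $|x| \le \lambda$ and $\mathrm{sign}(x)$ otherwise, and the limit solution on $[0, 1]$ is $u(t) = 1 - t$.

```python
def yosida(x, lam):
    """Yosida approximation A_lam of the subdifferential of f(x) = |x| on the real line."""
    if abs(x) <= lam:
        return x / lam
    return 1.0 if x > 0 else -1.0


def solve_regularized(a, lam, steps=10000):
    """Explicit Euler scheme for u' = -A_lam(u), u(0) = a, on [0, 1].

    The step h = 1 / steps should satisfy h < 2 * lam for stability.
    """
    h = 1.0 / steps
    u = a
    path = [u]
    for _ in range(steps):
        u -= h * yosida(u, lam)
        path.append(u)
    return path


def sup_distance(p, q):
    """Uniform distance between two grid functions of equal length."""
    return max(abs(x - y) for x, y in zip(p, q))


a = 1.0
steps = 10000
# Limit solution of u' in -d|u|, u(0) = 1, sampled on the same grid.
exact = [max(a - k / steps, 0.0) for k in range(steps + 1)]

for lam in (0.1, 0.05, 0.01):
    approx = solve_regularized(a, lam, steps)
    print(lam, sup_distance(approx, exact))
```

In this example the regularized solutions stay within $\lambda$ of the limit in the uniform norm, which mirrors the convergence $u_{\lambda_n} \to u$ in $C_H([0, 1])$ obtained above.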

Now we give an application of Proposition 3.6 which allows us to recover 
the existence of absolutely continuous solutions for the convex sweeping pro- 
cess in Hilbert spaces [21]. 

Proposition 3.7. Let $C : [0, 1] \to H$ be a closed convex valued multifunction
satisfying: there is a positive number $k > 0$ such that, for all $x, y \in H$ and for
all $s, t \in [0, 1]$

$$|d(x, C(t)) - d(y, C(s))| \le \|x - y\| + k|t - s|.$$

Let $a \in C(0)$. Then there is a unique absolutely continuous solution $u : [0, 1] \to H$ to

$$\begin{cases} \dot u(t) \in -\partial[k\,d_{C(t)}](u(t)) & \text{a.e. } t \in [0, 1], \\ u(0) = a, \ u(t) \in C(t), & \forall t \in [0, 1]. \end{cases}$$

Proof. By Proposition 3.6, there is a unique absolutely continuous solution
$u : [0, 1] \to H$ to

$$\begin{cases} \dot u(t) \in -\partial[k\,d_{C(t)}](u(t)) & \text{a.e. } t \in [0, 1], \\ u(0) = a \in C(0). \end{cases}$$

To finish the proof it is enough to apply Theorem 1.3 in [26] asserting that
any absolutely continuous solution $u(\cdot)$ of the preceding equation satisfies the
inclusion $u(t) \in C(t)$ for all $t \in [0, 1]$. □

Corollary 3.8. Let $C : [0, 1] \to H$ be a closed convex valued multifunction
satisfying: there is a positive number $k > 0$ such that, for all $x, y \in H$ and for
all $s, t \in [0, 1]$

$$|d(x, C(t)) - d(y, C(s))| \le \|x - y\| + k|t - s|.$$

Let $a \in C(0)$. Then there is a unique absolutely continuous solution $z : [0, 1] \to H$ to

$$\begin{cases} \dot z(t) \in -N_{C(t)}(z(t)) & \text{a.e. } t \in [0, 1], \\ z(0) = a, \ z(t) \in C(t), & \forall t \in [0, 1]. \end{cases}$$

Proof. Recall that $N_{C(t)}(z(t))$ denotes the normal cone of $C(t)$ at the point
$z(t)$. Let $z(\cdot)$ be the unique absolutely continuous solution $z : [0, 1] \to H$ to

$$\begin{cases} \dot z(t) \in -\partial[k\,d_{C(t)}](z(t)) & \text{a.e. } t \in [0, 1], \\ z(0) = a \in C(0), \ z(t) \in C(t), & \forall t \in [0, 1], \end{cases}$$

whose existence is ensured by Proposition 3.7. Then, by ([25], Proposition
11.12), we get

$$\partial[k\,d_{C(t)}](z(t)) \subset N_{C(t)}(z(t)),$$

and hence $-\dot z(t) \in N_{C(t)}(z(t))$. □

Abstract. Nonlinear extensions of some theorems concerning inverse-positive matrices
are made. The method of proof is based on an intuitively natural geometrical
consideration.



1. Introduction 

The properties of M-matrices are explained in a detailed way by Berman and
Plemmons[1], among which the inverse positiveness is well known and important
in various fields of sciences. In economics, the positive invertibility of
a real square matrix is crucial in discussing the non-substitution theorem for
models with joint production, which has been made clear in Herrero and
Villar[6]. (See also Schefold[8].) In Fujimoto et al.[5], a necessary and sufficient
condition is presented for a real square matrix to have the nonnegative inverse
by generalizing a proposition in Bidard[2, Chap. 10, p.113], and a simple proof
of the non-substitution theorem is given, using the duality approach by
Chander[3]. In this note, we present nonlinear extensions of some theorems
concerning inverse-positive matrices. The method of proof is similar to that in [5],
and is simple, elementary, and is based on an intuitive geometrical reasoning.

* Thanks are due to a grant from Ministerio de Ciencia y Tecnologia, under Project
BEC2001-0535. This paper was written while Takao Fujimoto was at Department
of Fundamental Economic Analysis, University of Alicante, under a sabbatical year
professorship given by the Spanish Ministry of Education. He is grateful for the
hospitality and warm environment there. The authors owe much to careful reading and
the comments by the referee.

In section 2, we present our assumptions and the propositions. Then in
section 3 another type of nonlinear generalization concerning positive invertibility
is given, whose linear version has not been stated explicitly so far in the
literature. Section 4 explains how this generalization covers the existing results
as special cases, giving some numerical examples. Finally in section 5, we
conclude with some remarks.



2. Assumptions and theorems 

Let us start by explaining notation. The two spaces $X$ and $Y$ are topological
vector spaces with their nonempty pointed closed convex cones $C$ and $D$ given
respectively. The partial order introduced to $X$ by the cone $C$ is denoted as
$x \ge y$ or $y \le x$ when $x - y \in C$ for $x, y \in C$. The same inequality sign is
used for the partial order of $Y$ introduced by $D$. These cones are assumed to
have their nonempty interior $\mathrm{int}\,C$ and $\mathrm{int}\,D$, and we write $x \gg y$ or $y \ll x$
when $x - y \in \mathrm{int}\,C$ or $\mathrm{int}\,D$ for $x, y \in C$ or $D$. One more inequality sign $x > y$
is used when $x - y \in C - \{0\}$ or $x - y \in D - \{0\}$. The boundary of a set $S$ is
denoted as $\mathrm{bd}\,S$. The symbol $\mathbb{R}^n$ means the real Euclidean space of dimension
$n$ ($n \ge 2$), and $\mathbb{R}^n_+$ the nonnegative orthant of $\mathbb{R}^n$. A given transformation $T$
maps $X$ to $Y$, and for this $T$ we make the following assumptions.

Assumption 2.1. There exists an $x^* \gg 0$ in $X$ such that $T(x^*) \gg 0$.

Assumption 2.2. If $T(x) \gg 0$ for $x \in C$, then $x \gg 0$.

Assumption 2.3. If $T(x^1) \ge 0$ and $T(x^2) \gg 0$ for $x^1, x^2 \in X$, then
$T(kx^1 + (1 - k)x^2) \gg 0$ for any scalar $k$ such that $0 < k < 1$.

It is important to note that the assumption 2.2 is made less restrictive by
requiring $x$ to be in $C$. When we consider a simple Leontief model, $T(x) = x - Ax$,
which maps $\mathbb{R}^n$ into itself with the material input coefficient matrix $A$ being an
$n \times n$ nonnegative matrix, the assumption 2.1 requires that the model be
productive, while the assumption 2.2 is satisfied naturally by the absence of joint
production. If we allow for joint production and $T(x) = Bx - Ax$, where $B$
is the $n \times n$ material output coefficient matrix, the assumption 2.2 tells us that
every process has to be operated if every commodity is produced in a positive
amount, and is called essentiality condition in Fujimoto et al.[5]. The
assumption 2.3 is satisfied when $T$ is linear. More generally when either (i) the space
$Y$ is $\mathbb{R}^n$ and each component function $T_j$ of $T$ is 'strongly' quasi-concave,
i.e., if $T_j(u) > T_j(v)$ then $T_j(ku + (1 - k)v) > T_j(v)$ for any scalar $k$ such
that $0 < k < 1$, or (ii) $T$ is super-additive, i.e., $T(u + v) \ge T(u) + T(v)$,
and positively homogeneous, i.e., $T(kx) = f(k)T(x)$ for a positive scalar $k$
with $f$ being a real non-decreasing function such that $f(k) > 0$ for $k > 0$, the
assumption 2.3 is satisfied.

Now we are ready to prove 

Proposition 2.1. Given the assumptions 2.1, 2.2 and 2.3, if $T(x) \ge 0$ for
$x \in X$, then $x \ge 0$.

Proof. Suppose to the contrary that $x \notin C$. Since the cone $C$ is closed, there
exists a scalar $k$ such that $0 < k < 1$ and $\bar{x} = kx + (1 - k)x^* \in \mathrm{bd}\,C$,
where $x^*$ is the element given in the assumption 2.1. By the assumption 2.3,
$T(\bar{x}) \gg 0$, yielding a contradiction to the assumption 2.2. □

When the assumption 2.2 is strengthened to the following:

Assumption 2.2*. If $T(x) > 0$ for $x \in C$, then $x \gg 0$.

we can prove

Proposition 2.2. Given the assumptions 2.1, 2.2* and 2.3, if $T(x) > 0$ for
$x \in X$, then $x \gg 0$.

Proof. By Proposition 2.1, we know that if $T(x) > 0$, then $x \in C$. Thus, the
assumption 2.2* guarantees that $x \gg 0$. □

Now we consider one more assumption 

Assumption 2.4. If $T(x) = 0$ for $x \in C$, then there exists a $y \notin C$ such that
$T(y) = 0$.

This assumption 2.4 is satisfied, for example, when $T$ is homogeneous, i.e.,
$T(kx) = f(k)T(x)$ for a scalar $k$ with $f$ being a real increasing function on
$\mathbb{R}$ such that $f(0) = 0$. We now prove

Proposition 2.3. Let $T(0) = 0$ and the assumptions 2.1-2.4 be all satisfied,
then the kernel of $T$, $\mathrm{Ker}\,T = \{x \mid x \in X, T(x) = 0\}$, is a singleton, consisting
of the origin only.

Proof. Suppose that there exists an $x \in \mathrm{Ker}\,T$ such that $x \ne 0$. If $x \notin C$, the
proof of Proposition 2.1 can be applied to show a contradiction to the
assumption 2.2. On the other hand, if $x \in C$, then by the assumption 2.4 we know
the existence of $y$ such that $y \notin C$ and $T(y) = 0$, which again leads to a
contradiction to the assumption 2.2. □

When $T$ is linear with $X = \mathbb{R}^n$ and $Y = \mathbb{R}^n$, Propositions 2.1 and 2.3
together are equivalent to the existence of nonnegative inverse of a
transformation matrix $T$.
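
The equivalence claimed for the linear case can be checked in a few lines. The sketch below assumes that $T$ is an $n \times n$ real matrix, that both cones are the nonnegative orthant $\mathbb{R}^n_+$, and that the conclusion of Proposition 2.1 is read as "$Tx \ge 0$ implies $x \ge 0$"; the unit vectors $e_i$ are notation introduced here.

```latex
% (a) Propositions 2.1 and 2.3 imply that T^{-1} exists and is nonnegative.
\[
\operatorname{Ker} T = \{0\} \ \Longrightarrow\ T^{-1} \text{ exists}, \qquad
T\,(T^{-1}e_i) = e_i \ge 0 \ \Longrightarrow\ T^{-1}e_i \ge 0 \quad (i = 1, \dots, n),
\]
% so every column of T^{-1} is nonnegative.
% (b) Conversely, a nonnegative inverse gives both conclusions.
\[
T^{-1} \ge 0,\ Tx \ge 0 \ \Longrightarrow\ x = T^{-1}(Tx) \ge 0, \qquad
Tx = 0 \ \Longrightarrow\ x = T^{-1}0 = 0 .
\]
```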
3. Positive invertibility in a shadow land



In this section, we discuss a nonlinear generalization of inverse-positive matri- 
ces which appear in a sort of shadow land of Leontief models. It seems better 
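to start with a numerical example. Let us take up a simple Leontief model in
which there are three commodities and three processes to produce those
commodities with the material input coefficient matrix $A$ represented as

$$A = \begin{pmatrix} 0.1 & 0.4 & 0.3 \\ 0.2 & 0.1 & 0.2 \\ 0.3 & 0.4 & 0.1 \end{pmatrix}.$$

We consider $T(x; \lambda) \equiv \lambda x - Ax$, which has the matrix representation

$$\lambda I - A = \begin{pmatrix} \lambda - 0.1 & -0.4 & -0.3 \\ -0.2 & \lambda - 0.1 & -0.2 \\ -0.3 & -0.4 & \lambda - 0.1 \end{pmatrix}.$$
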

The reciprocal of the value $\lambda$ can be interpreted as the growth factor, i.e., 1 + rate of
steady balanced growth. The Frobenius root $\lambda^*$ of $A$ is approximately 0.6772,
and the eigenvector associated with this eigenvalue is (0.7069..., 0.4899...,
0.7069...)', with the prime ' indicating the transposition. It is well known that
when $\lambda > \lambda^*$, $T(x; \lambda)$ has its inverse with all the elements positive because $A$
is indecomposable or irreducible. (See Berman and Plemmons[1].)
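
The quoted Frobenius root and eigenvector can be reproduced with a short computation. This is a minimal sketch: the power iteration and the function names are illustrative choices, not the authors' method, and the closed form uses the factorization $\lambda^3 - 0.3\lambda^2 - 0.22\lambda - 0.024 = (\lambda + 0.2)(\lambda^2 - 0.5\lambda - 0.12)$ of the characteristic polynomial of $A$.

```python
import math

# Input coefficient matrix A from the numerical example.
A = [[0.1, 0.4, 0.3],
     [0.2, 0.1, 0.2],
     [0.3, 0.4, 0.1]]


def mat_vec(m, v):
    """Matrix-vector product for a matrix stored as nested lists."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


def frobenius_root(m, iterations=200):
    """Power iteration for the dominant eigenvalue of a positive matrix.

    The iterate is rescaled so that its largest entry is 1; the scaling
    factor then converges to the Frobenius root.
    """
    x = [1.0] * len(m)
    root = 0.0
    for _ in range(iterations):
        y = mat_vec(m, x)
        root = max(y)
        x = [y_i / root for y_i in y]
    return root, x


root, vec = frobenius_root(A)
# Largest root of lambda^2 - 0.5 * lambda - 0.12 = 0.
closed_form = (0.5 + math.sqrt(0.73)) / 2

print(round(root, 4), round(closed_form, 4))
print([round(v, 4) for v in vec])
```

The eigenvector returned here is scaled so that its largest entry is 1, so it should be compared with the vector in the text through ratios of components, for instance $0.4899 / 0.7069 \approx 0.693$.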

What happens if $\lambda$ gets less than $\lambda^*$? At $\lambda = 0.677$, the inverse of $T(x; \lambda)$
is roughly

$$\begin{pmatrix} -1687 & -2339 & -1688 \\ -1170 & -1620 & -1170 \\ -1688 & -2339 & -1687 \end{pmatrix}.$$

The inverse of $T(x; \lambda)$ is more generally represented as

$$\frac{1}{\lambda^3 - 0.3\lambda^2 - 0.22\lambda - 0.024} \begin{pmatrix} \lambda^2 - 0.2\lambda - 0.07 & 0.4\lambda + 0.08 & 0.3\lambda + 0.05 \\ 0.2\lambda + 0.04 & \lambda^2 - 0.2\lambda - 0.08 & 0.2\lambda + 0.04 \\ 0.3\lambda + 0.05 & 0.4\lambda + 0.08 & \lambda^2 - 0.2\lambda - 0.07 \end{pmatrix}.$$

Thus, when $0.4 \le \lambda < \lambda^* = 0.6772\ldots$, the negative of $T(x; \lambda)$, i.e., $-T(x; \lambda)$
has its inverse with all the elements nonnegative. For example, at $\lambda = 0.4$,
$-T(x; \lambda) = Ax - \lambda x$ is

$$\begin{pmatrix} -0.3 & 0.4 & 0.3 \\ 0.2 & -0.3 & 0.2 \\ 0.3 & 0.4 & -0.3 \end{pmatrix},$$

and its inverse becomes

$$\begin{pmatrix} 0.104\ldots & 2.5 & 1.770\ldots \\ 1.25 & 0 & 1.25 \\ 1.770\ldots & 2.5 & 0.104\ldots \end{pmatrix}.$$

Now $-T(x; \lambda)$ can be rewritten as $Bx - (1 + g)Ax$, where

$$B = \begin{pmatrix} 0.1 & 0.4 & 0.3 \\ 0.2 & 0.1 & 0.2 \\ 0.3 & 0.4 & 0.1 \end{pmatrix}, \quad A = \begin{pmatrix} 0.4 & 0 & 0 \\ 0 & 0.4 & 0 \\ 0 & 0 & 0.4 \end{pmatrix},$$

and $0 \le g < 0.693\ldots$. Therefore, interpreting $B$ as the material output
coefficient matrix, $A$ as the input matrix, and $g$ as the rate of steady balanced
growth, our Leontief output equation $Bx - (1 + g)Ax = d$ has a nonnegative
solution for any nonnegative final demand vector $d \in \mathbb{R}^3_+$, when the rate of
balanced growth is not less than 0 and smaller than $0.693\ldots$. Each process
may be identified as an industry, not by its output, but by its input!
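
The sign pattern described above can be checked directly. This is a minimal sketch in plain Python: the helper names are hypothetical, and the 3 x 3 inverse is computed with the adjugate formula only because the example is small.

```python
import math

# Input coefficient matrix of the original Leontief model in the example.
A = [[0.1, 0.4, 0.3],
     [0.2, 0.1, 0.2],
     [0.3, 0.4, 0.1]]


def inverse_3x3(m):
    """Invert a 3x3 matrix with the adjugate (cofactor) formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[entry / det for entry in row] for row in adj]


def minus_t(lam):
    """Matrix of the map -T(x; lam) = A x - lam x."""
    return [[A[r][s] - (lam if r == s else 0.0) for s in range(3)]
            for r in range(3)]


def min_entry(m):
    """Smallest entry of a matrix stored as nested lists."""
    return min(min(row) for row in m)


lam_star = (0.5 + math.sqrt(0.73)) / 2      # Frobenius root of A
inv_at_04 = inverse_3x3(minus_t(0.4))

for lam in (0.39, 0.4, 0.5, 0.6, 0.677):
    print(lam, round(min_entry(inverse_3x3(minus_t(lam))), 4))

# Upper bound on the growth rate g when lam is written as 0.4 * (1 + g).
print(round(lam_star / 0.4 - 1, 4))
```

Below $\lambda = 0.4$ the middle diagonal entry of the inverse turns negative, which is why the interval of nonnegative invertibility starts at 0.4 in this example.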

We proceed to make a nonlinear extension of the above. Let $T(x; \lambda) = \lambda x - A(x)$,
where $A(x)$ is a nonlinear map from $X = \mathbb{R}^n$ to $\mathbb{R}^n$ with $n \ge 2$.
We revise the assumptions 2.1 and 2.2, making the latter stronger, and add one
more assumption on $A(x)$.

Assumption 3.1. There exists an $x^* > 0$ in $\mathbb{R}^n$ and a scalar $\lambda > 0$ such that
$T(x^*; \lambda) > 0$.

Assumption 3.2.

(i) The map $A(x)$ is indecomposable, i.e., for any $x \in \mathbb{R}^n$ and any two
nonempty disjoint subsets, $I$ and $J$ of index set $N = \{1, 2, \dots, n\}$ such
that $I \cup J = N$, at least one element of $A(x + y)$ in $J$ is greater than the
corresponding element of $A(x)$ when $y_i > 0$ for $i \in I$ and $y_j = 0$ for
$j \in J$.

(ii) $A(x)$ is isotone on $\mathbb{R}^n$, i.e., $A(x) \ge A(y)$ if $x \ge y$ for $x, y \in \mathbb{R}^n_+$.

Assumption 3.3.

(i) $A(x)$ is continuous on $\mathbb{R}^n$, and $A(0) = 0$;

(ii) $A(x)$ is quasi-convex and positively homogeneous of degree one, i.e.,
$A(kx) = kA(x)$ for any positive scalar $k$.

It should be noted that the assumption 3.2 makes the following assumption 3.2*
valid, which is a parameterized version of the assumption 2.2*.

Assumption 3.2*. If $T(x; \lambda) > 0$ for $x \in \mathbb{R}^n_+$ and for an arbitrary positive
scalar $\lambda$, then $x \gg 0$.

In the linear case, when A is an n x n nonnegative indecomposable matrix, all 
the assumptions 3.1, 3.2, and 3.3 are satisfied. 

First we present

Proposition 3.1. Given Assumptions 3.1, 3.2, and 3.3, there exists a $\lambda^*$ such that (i) if $\lambda > \lambda^*$, $T(x;\lambda) = d$ has a unique solution $x \ge 0$ for any $d > 0$, and (ii) $T(x;\lambda^*) = 0$ has a nonzero solution $x^* \ge 0$, which is unique up to scalar multiplication.

Proof. These are the results contained in Morishima and Fujimoto [7] and Fujimoto [4], thus the proof is omitted. □

Proposition 3.2. Given Assumptions 3.1, 3.2, and 3.3, there exists a $\lambda_0 > 0$ such that when $\lambda_0 < \lambda < \lambda^*$, $-T(x;\lambda)$ satisfies Assumptions 2.1, 2.2*, 2.3, and 2.4.

Proof. When $\lambda < \lambda^*$, we have $-T(x^*;\lambda) > 0$ for $x^* > 0$, thus $-T(x;\lambda)$ satisfies Assumption 2.1. Suppose Assumption 2.2* does not hold for $-T(x;\lambda)$ in any interval of the form $\lambda_0 < \lambda < \lambda^*$. This implies that, corresponding to a sequence $\{\lambda_i\}$ which is increasing and converging to $\lambda^*$, there exists a sequence of nonnegative nonzero vectors $\{x_i\}$ such that at least one element of each $x_i$ is zero, and $-T(x_i;\lambda_i) > 0$. Since the dimension is finite and $A(x)$ is continuous, we can find $x^{**} \in R^n_+$ with $x^{**} \notin \operatorname{int} R^n_+$ such that $-T(x^{**};\lambda^*) \ge 0$. If this were $T(x^{**};\lambda^*) = 0$, it would contradict Assumption 3.2 of indecomposability, or the uniqueness of $x^*$, a solution to $T(x;\lambda^*) = 0$. If the inequality were $T(x^{**};\lambda^*) < 0$, this would imply by the indecomposability the existence of $x > 0$ such that $T(x;\lambda^*) < 0$. This contradicts the property of $\lambda^*$ that $\lambda^* = \sup\{\lambda \mid \lambda > 0,\ T(x;\lambda) < 0 \text{ for some } x > 0\}$. (See Morishima and Fujimoto [7].) We can choose $\lambda_0$ to be positive near $\lambda^*$.

Therefore, we have found that there exists an interval of $\lambda$, $\lambda_0 < \lambda < \lambda^*$, within which all of Assumptions 2.1, 2.2*, 2.3, and 2.4 are satisfied for $-T(x;\lambda)$. □

Proposition 3.3. If $A(x)$ is linear, i.e., a nonnegative indecomposable $n \times n$ matrix $A$, there exists a $\lambda_0 > 0$ such that when $\lambda_0 < \lambda < \lambda^*$, $-(\lambda x - Ax)$ has a strictly positive inverse. ($\lambda^*$ is the Frobenius root of $A$.)

Proof. This proposition is evident because Proposition 3.2 shows Propositions 2.2 and 2.3 are valid. □



4. Examples

As in a simple Leontief model, when the spaces $X$ and $Y$ are the same Euclidean space of dimension $n$ with $n \ge 2$, and the transformation is $T(x) \equiv x - Ax$, where $A$ is an $n \times n$ real nonnegative matrix, Assumptions 2.2, 2.3, and 2.4 are naturally satisfied. Thus, all we have to examine is Assumption 2.1, the condition of productiveness. This case is well known and the reader is referred to Berman and Plemmons [1].

The case in which $T(x) \equiv x - A(x)$, where $A(x)$ is in general a nonlinear isotone mapping, and the spaces can be complete lattices of any dimension, is treated in Fujimoto [4]. Assumption 2.2 is again satisfied by the form of the mapping itself. Proposition 2.1 can be obtained without using Assumption 2.3, also by the very form of the map $T(x) \equiv x - A(x)$.

Now we take up the linear case in which the spaces $X$ and $Y$ are the same Euclidean space of dimension $n$ with $n \ge 2$, and the map is given as $Tx \equiv Bx - Ax$, where $A$ and $B$ are both $n \times n$ real matrices. This can be interpreted as a Leontief model with joint production. Assumptions 2.3 and 2.4 are clearly satisfied. Thus we have to check Assumptions 2.1 and 2.2 to be able to secure our propositions. Several numerical examples are given in Fujimoto et al. [5]. One of the interesting cases is when $T$ is represented as

\[ T \equiv \begin{pmatrix} 1 & -a & 1 \\ 1 & 1 & -a \\ -a & 1 & 1 \end{pmatrix}. \]

Then $T$ has the inverse

\[ T^{-1} = \frac{1}{a^2 - a - 2} \begin{pmatrix} -1 & -1 & -a+1 \\ -a+1 & -1 & -1 \\ -1 & -a+1 & -1 \end{pmatrix} \]

when $a \ne -1$ and $a \ne 2$. This inverse is nonzero and nonnegative if $1 < a < 2$. It is not difficult to see that Assumptions 2.1 and 2.2 are satisfied for $T$ when $1 < a < 2$. And it is easy to generalize the above example $T$ to a real square matrix having an odd number of columns and rows. When a real matrix has an odd number ($2k+1$ with $k \ge 1$) of columns and rows, its diagonal elements are all unity, and in each row, shifting from the diagonal element to the right, $(-a)$ and 1 appear alternately (jumping back to the first element at the rightmost one), then it is inverse-positive when $1 < a < (1 + \frac{1}{k})$. (Each element of the inverse contains a fraction whose denominator is $(ka^2 - a - (k+1))$, while its numerator is either $((k-1)a - k)$ or $(-a+1)$.) In this case also, it is not hard to see that Assumptions 2.1 and 2.2 are satisfied. Therefore, for example





\[ \begin{pmatrix} 1 & -1 & 1 & -1 & 1 \\ 1 & 1 & -1 & 1 & -1 \\ -1 & 1 & 1 & -1 & 1 \\ 1 & -1 & 1 & 1 & -1 \\ -1 & 1 & -1 & 1 & 1 \end{pmatrix} \]

has its inverse with all entries nonnegative.
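The matrices described in this section can be built and inverted exactly. The sketch below is illustrative only: `alternating_matrix` and `invert` are our own helper names, and the construction follows the verbal description (unit diagonal, then $-a$ and 1 alternating cyclically to the right).

```python
from fractions import Fraction

def invert(m):
    """Invert a square matrix of Fractions by Gauss-Jordan elimination."""
    n = len(m)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def alternating_matrix(k, a):
    """(2k+1) x (2k+1) matrix: 1 on the diagonal, then -a, 1, -a, ... cyclically to the right."""
    n = 2 * k + 1
    first = [Fraction(1) if j % 2 == 0 else -Fraction(a) for j in range(n)]
    return [[first[(j - i) % n] for j in range(n)] for i in range(n)]

inv3 = invert(alternating_matrix(1, Fraction(3, 2)))   # 3 x 3 case with 1 < a < 2
inv5 = invert(alternating_matrix(2, 1))                # the 5 x 5 matrix of 1 and -1 entries
inv_a4 = invert(alternating_matrix(1, 4))              # 3 x 3 case with a = 4

print([str(v) for v in inv3[0]])
print([str(v) for v in inv5[0]])
print([str(v) for v in inv_a4[0]])
```

For $a = 4$ every entry of $T^{-1}$ is negative, which is the same as saying that $-T$ has a positive inverse.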





To raise an example from nonlinear cases, we consider the following

\[ T(x) \equiv \begin{pmatrix} 1 & -b_{12} & 1 \\ 1 & 1 & -b_{23} \\ \cdots & \cdots & \cdots \end{pmatrix} x, \]

where $b_{12}$ and $b_{23}$ now depend upon $x$ as

\[ b_{12} \equiv \begin{cases} 2 - \log_e(x_2 + 1)/x_2, & \text{when } x_2 > 0 \\ 1, & \text{when } x_2 \le 0 \end{cases} \]

\[ b_{23} \equiv \begin{cases} 1 + x_3/(x_2 + 1), & \text{when } x_2 > 0 \\ 1 + x_3, & \text{when } x_2 \le 0. \end{cases} \]

The first component function

\[ T_1(x) \equiv \begin{cases} x_1 - (2x_2 - \log_e(x_2 + 1)) + x_3, & \text{when } x_2 > 0 \\ x_1 - x_2 + x_3, & \text{when } x_2 \le 0 \end{cases} \]

as well as the second function

\[ T_2(x) \equiv \begin{cases} x_1 + x_2 - (x_3 + x_3^2/(x_2 + 1)), & \text{when } x_2 > 0 \\ x_1 + x_2 - (x_3 + x_3^2), & \text{when } x_2 \le 0 \end{cases} \]

are concave. Assumption 2.1 is satisfied by $e \equiv (1, 1, 1)'$, and it is not difficult to see that Assumption 2.2 is also fulfilled. Thus, our propositions are applicable. The input coefficient $b_{12}$ implies decreasing returns to scale of process 2 because $b_{12}$ is increasing in $x_2$ when $x_2$ is positive, while $b_{23}$ means the external economies rendered by process 2 on process 3 because the input coefficient $b_{23}$ is decreasing in $x_2$ when $x_2$ is positive.
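The two stated component functions can be evaluated directly. This is a rough sketch under the assumption that the branch at $x_2 = 0$ belongs to the second case; the function names mirror the symbols in the text and are otherwise our own.

```python
import math

def b12(x2):
    """Input coefficient of process 2; equals 1 for x2 <= 0."""
    return 2.0 - math.log(x2 + 1.0) / x2 if x2 > 0 else 1.0

def b23(x2, x3):
    """Input coefficient of process 3, lowered by the activity level of process 2."""
    return 1.0 + x3 / (x2 + 1.0) if x2 > 0 else 1.0 + x3

def T1(x1, x2, x3):
    return x1 - b12(x2) * x2 + x3

def T2(x1, x2, x3):
    return x1 + x2 - b23(x2, x3) * x3

# First two components of T at e = (1, 1, 1)'.
print(T1(1.0, 1.0, 1.0), T2(1.0, 1.0, 1.0))

# Monotonicity of the coefficients in x2 on a small grid of positive values.
grid = [0.1 * i for i in range(1, 51)]
b12_values = [b12(v) for v in grid]
b23_values = [b23(v, 1.0) for v in grid]
```

Only the first two rows of $T$ are used here, since those are the component functions given explicitly in the text.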



5. Comments

It should be noted that we do not depend upon the theory of determinants at all, and the nonnegative invertibility of matrices has been established in an elementary way. Moreover, nonlinear generalizations have come out naturally.

Positive invertibility in Section 3 does not seem to need the special form $\lambda x - A(x)$, but can be extended to the case $B(x) - A(x)$, or even more generally to the case $T(x)$ with some additional conditions on $T$. In the linear case, there is no problem in extending along this line. Therefore, $T$ in Section 4 (Examples), when made negative, $-T$, has its inverse, when $a = 4$,

\[ \begin{pmatrix} 0.1 & 0.1 & 0.3 \\ 0.3 & 0.1 & 0.1 \\ 0.1 & 0.3 & 0.1 \end{pmatrix}. \]

In this case, $-T$ is inverse-positive for any $a > 2$, with the inverse converging to the 0 matrix as the parameter value $a$ approaches infinity.

1. Introduction

With $K$ representing either the real field $\mathbb{R}$ or the complex field $\mathbb{C}$, suppose $f(\bar x, \bar y) = 0$ for some function $f: X \times Y \to K^k$, where $X \subseteq K^n$ and $Y \subseteq K^k$. We seek a solution, an "implicit function," $\phi$ on some neighborhood of $\bar x$ in which $f(x, \phi(x)) = 0$ holds.

Cauchy, working with complex scalars, originally assumed that $f$ was analytic [10], and Dini's formulation for real scalars assumed that $f$ was $C^1$ [15]. Similar smoothness assumptions have formed the backbone of most proofs since then. But what if $f$ is only differentiable, rather than $C^1$, and only at $(\bar x, \bar y)$? It will be shown that a solution $\phi$ still exists.

We are indebted to Professor Wayne H. Richter, University of Minnesota; Professor Hülya Eraslan, University of Pennsylvania; and Nevzat Eren, University of Minnesota, for valuable comments on an earlier version.

When $f$ is differentiable at $(\bar x, \bar y)$ and continuous in a neighborhood of $(\bar x, \bar y)$, Halkin [23] showed that an implicit function does exist locally. (He allowed $X$ to be an arbitrary normed linear space.)

When $g: X \to K^n$ is differentiable at $\bar x$ and has a nonzero Jacobian determinant, and when it is also locally continuous, Halkin also proved that a right inverse exists. Further, when the Jacobian determinant of $g$ is nonzero in a neighborhood, Radulescu and Radulescu [36] showed that $g$ is a local diffeomorphism.

When $X$ as well as $Y$ are finite dimensional, we extend Halkin's point-result for the implicit function theorem by weakening the continuity assumption on $f$. We do this with an alternative application of Brouwer's fixed point theorem. We also apply the fixed point theorem with Brouwer's theorem on invariance of domain to give an alternative proof of Radulescu and Radulescu's result, without explicit mention of degree theory.

Our first two theorems weaken the $C^1$ assumption to either differentiability at $(\bar x, \bar y)$, or to partial differentiability with respect to $y$ at $(\bar x, \bar y)$, together with some continuity conditions on $f$. In the former case we obtain differentiability of all solutions at $\bar x$; in the latter case we obtain continuity of all solutions at $\bar x$. A third theorem combines the two frameworks. Solutions need not be unique. However, when $f$ is differentiable in a neighborhood of $(\bar x, \bar y)$ and $f_y$ is surjective there, then there is a unique solution and it is differentiable.

Theorem 1 yields a General Inverse Function Theorem (Theorem 4), which in turn characterizes diffeomorphisms without continuous differentiability.

Because implicit and inverse function theorems play a role in several parts of mathematics, there are many applications.



2. General implicit function theorems 

Let $K$ be the real field $\mathbb{R}$ or the complex field $\mathbb{C}$, and let $\|\cdot\|$ be any norm on the finite dimensional Banach space $K^m$ over $K$. For any $z \in K^m$ and real $\varepsilon \ge 0$, we denote by $B^m_\varepsilon(z)$ the closed ball in $K^m$ centered at $z$ with $\|\cdot\|$-radius $\varepsilon$. When we talk of continuity of functions on subsets $A \subseteq K^m$ we mean continuity with respect to the relative topology induced on $A$ from $K^m$. We denote by $\operatorname{Int}(A)$ the interior of any subset $A$ of $K^m$.

We will need to bound the norm on $K^{n+k}$ in terms of the norms on $K^n$ and $K^k$. We note that the values $\|(x,0)\|$ of elements $(x,0) \in K^{n+k}$ define a norm on $K^n$. Since all norms on a finite dimensional real vector space are equivalent, there exists a real $\beta_1 > 1$ such that:

\[ \|(x,0)\| \le \beta_1 \|x\| \quad \text{for all } x \in K^n. \tag{1a} \]

Analogously, there exists a real $\beta_2 > 1$ such that:

\[ \|(0,y)\| \le \beta_2 \|y\| \quad \text{for all } y \in K^k. \tag{1b} \]

Consequently: for all real $\zeta$, and all $x \in K^n$ and $y \in K^k$,

\[ \|y\| \le \zeta\|x\| \implies \|(x,y)\| \le (\beta_1 + \beta_2\zeta)\|x\|. \tag{2} \]
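The step from (1a) and (1b) to (2) is a one-line triangle-inequality argument. Spelled out, with $\zeta$ as our assumed name for the real constant in (2), it reads as follows; the same chain reappears later in the proof of Theorem 1.

```latex
\[
\|(x,y)\| \;\le\; \|(x,0)\| + \|(0,y)\|
          \;\le\; \beta_1\|x\| + \beta_2\|y\|
          \;\le\; (\beta_1 + \beta_2\zeta)\,\|x\|
\qquad \text{whenever } \|y\| \le \zeta\|x\|.
\]
```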

If $X$ is a subset of $K^m$, we denote by $\mathrm{id}_X$ the identity function on $X$.

For $u \in U \subseteq K^m$ and $f: U \to K^k$, we say $f$ is Lipschitz at $u$ (for $\|\Delta u\| \le \mu$) if there is some real $\mu > 0$ and some real $\gamma \ge 0$ such that:

\[ \|f(u + \Delta u) - f(u)\| \le \gamma\|\Delta u\| \quad \text{for all } \Delta u \text{ with } u + \Delta u \in U \text{ and } \|\Delta u\| \le \mu. \tag{3} \]

By differentiability we will always mean Fréchet differentiability. Because we do not always assume that the domain of $f$ is open, derivatives, when they exist, need not be unique; and in that case $f'(u)$ will denote any derivative.

If $f: X \to Y$ for some nonempty set $X \subseteq K^n$ and some set $Y \subseteq K^n$, then we say that $f: X \to f(X)$ is a diffeomorphism if $f$ is injective, $f(X)$ is open, and both $f$ and its inverse $f^{-1}: f(X) \to X$ are differentiable. And we say that $f$ is a local diffeomorphism at $u \in X$ if there is some nonempty open neighborhood $U \subseteq X$ of $u$ such that $f|_U : U \to f(U)$ is a diffeomorphism. If a diffeomorphism and its inverse are $r$ times differentiable, then we say it is a $C^r$ diffeomorphism.

While the classical proofs of implicit or inverse function theorems often use contraction mapping theorems, we will use Brouwer's fixed point theorem for finite dimensional Banach spaces.$^{1}$ (Usually stated for $K = \mathbb{R}$, it holds for $K = \mathbb{C}$ as well, since every finite dimensional complex Banach space is isomorphic (hence homeomorphic) to $\mathbb{C}^n$ for some $n$, which in turn is homeomorphic to $\mathbb{R}^{2n}$.)

We will also use the finite dimensional Banach space version of another theorem of Brouwer, which follows immediately from the result [6] for real scalars by the isomorphism and homeomorphism just mentioned:

(Brouwer's Theorem on Invariance of Domain.) If $A$ is an open subset of $K^n$ and $g: A \to K^n$ is a continuous injection, then $g(A)$ is an open subset of $K^n$, hence $g$ is an open mapping, so $g^{-1}$ is continuous and $g$ is a homeomorphism of $A$ onto $g(A)$.

Our first theorem assumes differentiability of $f$ at a point, and yields differentiability of solutions at the point (in parts (a) and (b)).

$^{1}$ An earlier application of Brouwer's fixed point theorem to an implicit function theorem was made in [23]. We were unaware of Halkin's result when we wrote the initial versions of this paper [25]. See the details, p. 94.

Theorem 1. Differentiable implicit function theorem. With $K = \mathbb{R}$ or $K = \mathbb{C}$, let $X$ be a subset of $K^n$, let $Y$ be a subset of $K^k$, and let $U \subseteq X \times Y$. Let $\bar u = (\bar x, \bar y) \in U$, where $\bar x$ is a limit point of $X$, and $\bar y \in Y$.$^{2}$ Suppose that $f: U \to K^k$ and:

\[ f(\bar x, \bar y) = 0; \tag{4a} \]

\[ f(\cdot,\cdot) \text{ is differentiable at } (\bar x, \bar y), \text{ with partial derivatives } f_x(\bar x, \bar y) \text{ and } f_y(\bar x, \bar y); \tag{4b} \]

\[ f_y(\bar x, \bar y) \text{ is surjective; i.e., } \det \begin{pmatrix} \dfrac{\partial f^1(\bar x, \bar y)}{\partial y_1} & \cdots & \dfrac{\partial f^1(\bar x, \bar y)}{\partial y_k} \\ \vdots & & \vdots \\ \dfrac{\partial f^k(\bar x, \bar y)}{\partial y_1} & \cdots & \dfrac{\partial f^k(\bar x, \bar y)}{\partial y_k} \end{pmatrix} \ne 0. \tag{4c} \]

Suppose that, for some $\gamma > \|(f_y(\bar x, \bar y))^{-1} f_x(\bar x, \bar y)\|$,$^{3}$ for all $x, y$:

\[ x \in X \text{ and } y \in B^k_{\gamma\|x - \bar x\|}(\bar y) \implies (x,y) \in U \text{ and } f(x,\cdot) \text{ is defined and continuous on } B^k_{\gamma\|x - \bar x\|}(\bar y). \tag{4d} \]

For any $\delta > 0$, let $X_\delta \equiv B^n_\delta(\bar x) \cap X$.

Then:

a) There exists a real $\delta > 0$ and a function $\phi \in \prod_{x \in X_\delta} B^k_{\gamma\|x - \bar x\|}(\bar y)$ such that: for all $x \in X_\delta$,

\[ f(x, \phi(x)) = 0 \tag{5a} \]
\[ \phi(\bar x) = \bar y. \tag{5b} \]

b) The $\delta$ in part (a) can be chosen so that every function $\phi \in \prod_{x \in X_\delta} B^k_{\gamma\|x - \bar x\|}(\bar y)$ satisfying (5) is differentiable at $\bar x$.

c) For every $\delta > 0$, all functions $\phi$ defined on $X_\delta$ and satisfying (5) that are differentiable at $\bar x$ have the same derivative $\phi'(\bar x)$:

\[ \phi'(\bar x) = -(f_y(\bar x, \bar y))^{-1} f_x(\bar x, \bar y). \tag{6} \]

$^{2}$ In textbook cases, $X$, $Y$, and $U$ are open sets. Here we are not assuming that $X$ even contains an open set; it could, for example, contain just $\bar x$ and a sequence of points $x^i \to \bar x$.

$^{3}$ By the norm of a linear transformation $L$, we mean $\|L\| = \max\{\|Lx\| : \|x\| = 1\}$.

d) If $U$ is open, then the derivatives $f_x(\bar x, \bar y)$, $f_y(\bar x, \bar y)$, and $\phi'(\bar x)$ are uniquely determined.

e) If $X$ is open and if (4b, 4c, 4d) are replaced by these stronger conditions:

\[ f \text{ is differentiable on } X \times Y \tag{4e} \]
\[ f_y \text{ is surjective on } X \times Y, \tag{4f} \]

then an open neighborhood $X_0 \times Y_0 \subseteq X \times Y$ of $(\bar x, \bar y)$ can be chosen so that:

i) there is a unique function $\phi: X_0 \to Y_0$ satisfying (5) for $x \in X_0$;

ii) $\phi$ is differentiable on $X_0$, with:

\[ \phi'(x) = -(f_y(x, \phi(x)))^{-1} f_x(x, \phi(x)). \tag{7} \]

f) (Classical) If $X$ is open, and if $f$ is $C^r$ on $X \times Y$ for some $r \ge 1$, then there is an open neighborhood $X_0 \times Y_0$ of $(\bar x, \bar y)$ and a unique function $\phi: X_0 \to Y_0$ satisfying (5), and $\phi$ is $C^r$ on $X_0$.

Remark 1 We cannot totally eliminate the continuity hypothesis (4d). Consider the function $f: \mathbb{R}^2 \to \mathbb{R}$ defined by:

\[ f(x,y) \equiv \begin{cases} x + y, & \text{if } x + y \ne 0 \\ x^2 + y^2, & \text{otherwise.} \end{cases} \tag{8} \]

The local continuity hypothesis of Theorem 1 fails, since $f(x,\cdot)$ is discontinuous at all $(x,y) \ne (0,0)$ with $x + y = 0$. So for every $x \ne 0$, the function $f(x,\cdot)$ is not continuous even on $B^1_{\gamma\|x\|}(0)$. Yet all other hypotheses hold: $f(0,0) = 0$, $f$ is differentiable at $(0,0)$, and $f_y(0,0) = 1$. Here one cannot solve for $y$ as a function of $x \ne 0$, since no $(x,y)$ in $\mathbb{R}^2$, other than $(0,0)$, satisfies $f(x,y) = 0$.
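The behaviour of this counterexample can be checked on a rational grid. The sketch below assumes the reading of (8) in which the first branch applies when $x + y \ne 0$ and the second branch is $x^2 + y^2$; the grid and its spacing are arbitrary choices of ours.

```python
from fractions import Fraction

def f(x, y):
    """Function (8): x + y off the line x + y = 0, and x^2 + y^2 on that line."""
    return x + y if x + y != 0 else x * x + y * y

# Search a grid of rationals in [-2, 2] x [-2, 2] for zeros of f.
grid = [Fraction(i, 10) for i in range(-20, 21)]
zeros = [(x, y) for x in grid for y in grid if f(x, y) == 0]
print(zeros)

# Discontinuity of f(1, .) at y = -1: the value on the line is 2,
# while nearby values x + y are arbitrarily close to 0.
on_line = f(Fraction(1), Fraction(-1))
near_line = f(Fraction(1), Fraction(-1) + Fraction(1, 1000))
print(on_line, near_line)
```

Exact rationals matter here because the branch test `x + y != 0` would be unreliable with floating-point inputs.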

It is tempting, therefore, to try and prove the theorem assuming that, for 
every x, the function /(x, • ) is continuous on B^^^^{y). But our next example 
shows that is not possible: Theorem I’s requirement on 7 is “tight.” First we 
define the function g: R — ^ R by: 

g{x) = ~{x + x^), ( 9 ) 

and then we define the function / : ^ E by: 



f{x,y) 



x-g i(y), ifx - 5 ^{y) 0 

, otherwise. 



( 10 ) 



T^k 

For each x, the function /(x, • ) is continuous on Bj|^|| (0); that is clear from 
the figure indicating typical level curves.

Here the dashed level curve through the origin is the locus of the only discontinuities of $f$. Even though all other assumptions of the theorem hold, again there is no implicit function solving for $y$ as a function of $x \ne 0$: $f(x,y) = 0$ only for $(x,y) = (0,0)$.

Remark 2 Theorem 1 replaces the classical $C^1$ hypotheses of Cauchy's (complex) theorem and Dini's (real) theorem$^{4}$ by the weaker assumptions (4b, 4d). Its conclusions (a, b) are weaker than the classical versions, as the implicit functions $\phi$ are not necessarily unique or $C^1$. The conclusion (e) is part of the classical Implicit Function Theorems.

Remark 3 Simple examples show that, under the hypotheses of parts (a), (b), and (c), the function $\phi$ need not be unique.

Proof of Theorem 1. Without loss of generality, we simplify the discussion by assuming that:

\[ \bar x = 0 \in K^n \text{ and } \bar y = 0 \in K^k. \tag{11} \]

(a) (Existence) To prove the existence assertions (5), it suffices to prove:

\[ \exists \delta > 0 \quad \forall x \in X \text{ with } \|x\| \le \delta \quad \exists y \text{ with } \|y\| \le \gamma\|x\| : \quad (x,y) \in U \ \&\ f(x,y) = 0. \tag{12} \]

For (12) guarantees for each $x \in X_\delta$ the existence of a point $y \in B^k_{\gamma\|x\|}(0)$ such that:

\[ f(x,y) = 0 \tag{13a} \]
\[ x = 0 \implies y = 0 \tag{13b} \]
\[ \|y\| \le \gamma\|x\|. \tag{13c} \]

Thus we can define a function $\phi$ on $X_\delta$ taking values $\phi(x) = y \in B^k_{\gamma\|x\|}(0)$, with:

\[ \phi(x) = y \tag{14} \]

and with the properties (5).$^{5}$

$^{4}$ See the historical notes in Section 4.

We break our proof of (12) into parts corresponding to its quantifiers.

i) Writing:

\[ M \equiv \|(f_y(0,0))^{-1} f_x(0,0)\|, \tag{15} \]

we define:

\[ \alpha \equiv \gamma - M, \tag{16} \]

which is positive by hypothesis.

ii) The choice of $\delta$ is delicate, and will involve several steps. First note that, since $f$ is differentiable at $(0,0)$ with partial derivatives $f_x(0,0)$ and $f_y(0,0)$, we have: for all $(x,y) \in K^n \times K^k$ with $(x,y) \in U$,

\[ f(x,y) = f(0,0) + f_x(0,0) \cdot x + f_y(0,0) \cdot y + r(x,y), \tag{17} \]

where $r(\cdot,\cdot)$ is $o(\|(x,y)\|)$, i.e.,

\[ \frac{\|r(x,y)\|}{\|(x,y)\|} \to 0 \quad \text{for all } (x,y) \in U \text{ with } 0 \ne \|(x,y)\| \to 0. \tag{18} \]

Note that the function $r(x,\cdot)$ determined by (17) is defined and continuous on $B^k_{\gamma\|x\|}(0)$ for all $x \in X$, by hypothesis (4d).

By (4c) we can define $T \equiv (f_y(0,0))^{-1}$.

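Now for any given $x \in X$, satisfying (12) means finding $y \in B^k_{\gamma\|x\|}(0)$ with $f(x,y) = 0$, a task that can be stated in each of these equivalent forms: $(x,y) \in X \times Y$ and

\[ f(x,y) = 0 \tag{19a} \]
\[ f_x(0,0) \cdot x + f_y(0,0) \cdot y + r(x,y) = 0 \quad \text{(by (17) and (4a))} \tag{19b} \]
\[ r(x,y) = -f_x(0,0) \cdot x - f_y(0,0) \cdot y \quad \text{(rearranging)} \tag{19c} \]
\[ T r(x,y) = -T f_x(0,0) \cdot x - T f_y(0,0) \cdot y \quad \text{(applying } T\text{)} \tag{19d} \]
\[ T r(x,y) = -T f_x(0,0) \cdot x - y \quad \text{(by definition of } T\text{)} \tag{19e} \]
\[ y = -T r(x,y) - T f_x(0,0) \cdot x \quad \text{(rearranging).} \tag{19f} \]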



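$^{5}$ The Axiom of Choice is not needed for this. For example, we could use the lexicographic ordering based on $y_1, \ldots, y_k$ to choose among multiple candidates $y$ with minimum norm in each of the compact sets $(f(\bar x + x, \cdot))^{-1}(0) \cap B^k_{\gamma\delta}(\bar y)$.

In (iii) below, for every $x \in X$ near 0, we will apply a fixed point theorem to find a solution $y$ of (19f), hence of (19a). To prepare for the fixed point theorem, we will bound the two terms on the right hand side of (19f).

Note that:

\[ \text{The equivalences (19) hold for all } x \in X \text{ and } y \text{ with } \|y\| \le \gamma\|x\|, \text{ since then (4d) implies that } f(x,y) \text{ and thus } r(x,y) \text{ are defined.} \tag{20} \]

ii.1) To bound the first term on the right in (19f), we note from (18) that:

\[ \frac{\|T r(x,y)\|}{\|(x,y)\|} \to 0 \quad \text{for all } (x,y) \in U \text{ with } 0 \ne \|(x,y)\| \to 0, \tag{21} \]

so by (20):

\[ \forall \varepsilon > 0 \quad \exists \hat\delta > 0 \quad \forall (x,y) \text{ with } \|(x,y)\| \le \hat\delta,\ x \in X,\ \|y\| \le \gamma\|x\| : \quad \|T r(x,y)\| \le \varepsilon\|(x,y)\| \le \varepsilon(\beta_1 + \beta_2\gamma)\|x\| \quad \text{(by (2)).} \tag{22} \]

Then for $\varepsilon = \alpha/(\beta_1 + \beta_2\gamma)$, we have:

\[ \exists \hat\delta > 0 \quad \forall (x,y) \text{ with } \|(x,y)\| \le \hat\delta,\ x \in X,\ \|y\| \le \gamma\|x\| : \quad \|T r(x,y)\| \le \alpha\|x\|. \tag{23} \]

Let $\hat\delta$ be any such number. Defining:

\[ \delta \equiv \hat\delta/(\beta_1 + \beta_2\gamma), \tag{24} \]

we see that: if $\|x\| \le \delta$ and $\|y\| \le \gamma\|x\|$ then,

\[ \|(x,y)\| \le (\beta_1 + \beta_2\gamma)\|x\| \ \text{(by (2))} \ \le (\beta_1 + \beta_2\gamma)\delta = \hat\delta. \tag{25} \]

From (20), (25), (23), and (4d) we have:

\[ \forall x \in X \text{ with } \|x\| \le \delta \quad \forall y \text{ with } \|y\| \le \gamma\|x\| : \quad (x,y) \in U \ \&\ \|T r(x,y)\| \le \alpha\|x\|. \tag{26} \]

ii.2) To bound the second term in the right hand side of (19f), note that by the definition of $M$: for all $x \in K^n$,

\[ \|T f_x(0,0) \cdot x\| \le M\|x\|. \tag{27} \]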

iii) To complete the proof of (12) we begin by picking any $x \in X$ with $\|x\| \le \delta$. We must find a $y \in Y$ with $\|y\| \le \gamma\|x\|$ and $f(x,y) = 0$. This last equality means finding a $y$ equating the left and right sides of (19f). Representing the right side through the function $F: B^k_{\gamma\|x\|}(0) \to K^k$ defined by:

\[ F(y) \equiv -T r(x,y) - T f_x(0,0) \cdot x, \tag{28} \]

we see that we are seeking a fixed point of $F(\cdot)$.

Clearly $F$ is continuous in $y$ because $r(x,\cdot)$ is (by (17) and (4d)). And for all $y$ with $\|y\| \le \gamma\|x\|$:

\[ \begin{aligned} \|F(y)\| &= \|-T r(x,y) - T f_x(0,0) \cdot x\| \\ &\le \|T r(x,y)\| + \|T f_x(0,0) \cdot x\| \\ &\le (\alpha + M)\|x\| \quad \text{(by (26) and (27))} \\ &= \gamma\|x\| \quad \text{(by (16)).} \end{aligned} \tag{29} \]

Thus $F$ is a continuous function that carries the ball $B^k_{\gamma\|x\|}(0) \subseteq Y$ into itself. By Brouwer's Fixed Point Theorem, $F$ must have a fixed point $F(y) = y \in B^k_{\gamma\|x\|}(0)$, and therefore (by the equivalence of (19f) and (19a)) we have $f(x,y) = 0$.

Using our definition (14), this allows us to define an implicit function $\phi$ on $X_\delta$ satisfying (5).

Note that our construction of $\phi$ was based on a particular choice of $f_x(\bar x, \bar y)$ and $f_y(\bar x, \bar y)$.$^{6}$ Since $U$ is not necessarily open, these are not necessarily uniquely determined, and a different choice may give a different definition of $\phi$.
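The construction in part (a) can be traced numerically in one dimension, where Brouwer's theorem reduces to the intermediate value theorem. The sketch below is only an illustration under our own assumptions: the test function `f`, the choice $\gamma = 1$, and the bisection routine are hypothetical and do not come from the paper. For this `f` one has $r(x,y) = y^2$, and the ball $|y| \le \gamma|x|$ is mapped into itself whenever $|x| \le 1$.

```python
def f(x, y):
    """Hypothetical test function with f(0,0) = 0, f_x(0,0) = -1, f_y(0,0) = 2."""
    return 2.0 * y - x + y * y

fx0, fy0 = -1.0, 2.0
T = 1.0 / fy0                  # T = (f_y(0,0))^{-1}
M = abs(T * fx0)               # M as in (15); here M = 0.5
gamma = 1.0                    # any gamma > M will do

def F(x, y):
    """Right-hand side of (19f): -T r(x,y) - T f_x(0,0) x."""
    r = f(x, y) - fx0 * x - fy0 * y      # remainder r(x,y) from (17)
    return -T * r - T * fx0 * x

def fixed_point(x, tol=1e-12):
    """Bisection on F(x,y) - y over the ball |y| <= gamma*|x|."""
    lo, hi = -gamma * abs(x), gamma * abs(x)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(x, mid) - mid > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x = 0.5
y = fixed_point(x)
print(y, f(x, y))
```

Bisection is used here only because the example is scalar; in higher dimensions the theorem guarantees a fixed point without providing an algorithm for it.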

(b) (Differentiability at $\bar x$) To prove that $\delta$ can be chosen in part (a) so that every implicit function $\phi \in \prod_{x \in X_\delta} B^k_{\gamma\|x\|}(\bar y)$ is differentiable, we first show that $\delta$ can be chosen so that any implicit function is Lipschitz at $x = 0$.

b.i) The particular implicit function constructed in our proof of part (a) has the Lipschitz property at $x = 0$, as we see from (13c) and our method of construction in (a.iii). By itself, however, this does not prove that all functions $\phi$ satisfying (5) have the Lipschitz property. To show that every function $\phi$ satisfying part (a) of the Theorem has the Lipschitz property at $\bar x = 0$ for $x$ in some neighborhood of $\bar x$, we must show that:

\[ \exists \delta > 0 \quad \forall \phi \in IM(\delta) \quad \exists \varepsilon > 0 \quad \exists \eta > 0 \quad \forall x \in X_\delta \text{ with } \|x\| \le \varepsilon : \quad \|\phi(x)\| \le \eta\|x\|, \tag{30} \]



® Cf. the definition of M in ( 1 5), the definition of 7 at the beginning of part (a,i) of the 
proof, the definition of a in (16), the definition of r(x, y) in (17), the definition of 
5 in (22), the definition of 6 in (24), the definition of 5 in part (ii.2) preceding (27), 
and the definition of F in (28). 
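where we write $\phi \in IM(\delta)$ to indicate that $\phi \in \prod_{x \in X_\delta} B^k_{\gamma\|x\|}(\bar y)$ is an implicit mapping, i.e., $\phi(x) \in B^k_{\gamma\|x\|}(0)$ and $\phi(0) = 0$ and $f(x, \phi(x)) = 0$ for all $x \in X_\delta$.

Suppose that (30) fails, so:

\[ \forall \delta > 0 \quad \exists \phi \in IM(\delta) \quad \forall \varepsilon > 0 \quad \forall \eta > 0 \quad \exists x \in X_\delta \text{ with } \|x\| \le \varepsilon : \quad \|\phi(x)\| > \eta\|x\|. \tag{31} \]

We will obtain a contradiction.

For each $i \in \mathbb{N}$, pick a positive real $\delta^i$ with $\delta^i \searrow 0$ as $i \to \infty$. Then by (31) we can choose, for each $i \in \mathbb{N}$:

\[ \text{a function } \phi^i \in IM(\delta^i), \text{ so } \phi^i \in \textstyle\prod_{x \in X_{\delta^i}} B^k_{\gamma\|x\|}(\bar y) \tag{32a} \]
\[ \text{a positive } \varepsilon^i \text{ with } \varepsilon^i \searrow 0 \text{ for } i \to \infty \tag{32b} \]
\[ \text{a positive } \eta^i \text{ with } \eta^i \nearrow \infty \text{ for } i \to \infty \tag{32c} \]
\[ x^i \in X_{\delta^i} \ \&\ 0 < \|x^i\| \le \varepsilon^i \text{ with } \|\phi^i(x^i)\| > \eta^i\|x^i\|. \tag{32d} \]

It follows from (32a, 32d) that $\eta^i\|x^i\| < \|\phi^i(x^i)\| \le \gamma\|x^i\|$, so $\eta^i < \gamma$. This contradiction of (32c) completes our proof that every solution $\phi$ of part (a) must be Lipschitz at $\bar x$.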

b.ii) To prove Theorem 1(b), let $\phi$ be any solution of part (a), and consider any value $f_x(0,0)$, still assuming that $\bar x = 0$ and $\bar y = 0$. It suffices to show that:

\[ \frac{\|\phi(x) - \phi(0) + (f_y(0,0))^{-1} f_x(0,0) x\|}{\|x\|} \to 0 \quad \text{as } x \in X \ \&\ \|x\| \to 0. \tag{33} \]

From (b.i) we know that $\phi$ is Lipschitz at $x = 0$ in some $X$-neighborhood of $\bar x$, say with Lipschitz constant $\gamma$.

Since $f$ is differentiable at $(0,0)$,

\[ f(x, \phi(x)) = f(0,0) + f_x(0,0) x + f_y(0,0) \phi(x) + o(\|(x, \phi(x))\|) \tag{34} \]

for all $x \in X$, so by (11) and (5) we have:

\[ \frac{\|f_x(0,0) x + f_y(0,0) \phi(x)\|}{\|(x, \phi(x))\|} \to 0 \quad \text{as } x \in X \ \&\ \|(x, \phi(x))\| \to 0. \tag{35} \]

With $\beta_1$ and $\beta_2$ as in (1), we have:

\[ \begin{aligned} \|(x, \phi(x))\| &\le \|(x,0)\| + \|(0, \phi(x))\| \\ &\le \beta_1\|x\| + \beta_2\|\phi(x)\| \\ &\le (\beta_1 + \beta_2\gamma)\|x\|, \end{aligned} \tag{36} \]

for all $x \in X$ near 0. So (35) implies:

\[ \frac{\|f_x(0,0) x + f_y(0,0) \phi(x)\|}{\|x\|} \to 0 \quad \text{as } x \in X \ \&\ \|x\| \to 0. \tag{37} \]

Then because $(f_y(0,0))^{-1}$ is linear,

\[ \frac{\|(f_y(0,0))^{-1} f_x(0,0) x + \phi(x)\|}{\|x\|} \to 0 \quad \text{as } x \in X \ \&\ \|x\| \to 0, \tag{38} \]

and this reformulation of (33) verifies that $\phi$ is differentiable at $x = 0$.

(c) The claim of Theorem 1(c) is clear from our proof of part (b.ii); in particular, the proof of (6) follows from (38).

(d) Here $X$ and $Y$ can be taken as open, so the derivatives $f_y(\bar x, \bar y)$ and $f_x(\bar x, \bar y)$ are uniquely determined, and then parts (b) and (c) guarantee that, for all $\delta, \eta > 0$, and all $(X_\delta, \phi)$ satisfying part (a) of Theorem 1, the derivative $\phi'(\bar x)$ exists, with the unique value given by (6).

(e) Let $X_\delta$ and $\phi: X_\delta \to Y$, with $\phi(x) \in B^k_{\gamma\|x - \bar x\|}(\bar y) \subseteq Y$, have the properties of $X_\delta$ and $\phi$ of part (a) of Theorem 1. Since we can now assume that both $X$ and $Y$ are open, we can assume that $X_\delta$ is also open.

(e.i) (Uniqueness on $X_0$) We will pick an open $X_0 \subseteq X_\delta$ and an open $Y_0 \subseteq Y$ in order to find a unique implicit function $\phi$. First define $G: X \times Y \to X \times K^k$ by:

\[ G(x,y) \equiv (x, f(x,y)). \tag{39} \]

By hypothesis (4e), $G$ is differentiable on $X \times Y$, and from hypothesis (4f) we see that the derivative $G'(x,y)$ is surjective at all $(x,y) \in X \times Y$. Then parts (a) and (c.ii) of Theorem 4 (the General Inverse Function Theorem)$^{7}$ imply there is an open neighborhood $X_\diamond \times Y_\diamond \subseteq X \times Y$ of $G(\bar x, \bar y) = (\bar x, 0)$ such that $G|_{X_\diamond \times Y_\diamond}$ is a diffeomorphism to the open neighborhood $G(X_\diamond \times Y_\diamond)$ of $G(\bar x, \bar y) = (\bar x, 0)$. So there is an open neighborhood $X_0 \subseteq X_\delta$ of $\bar x$ such that, for every $x \in X_0$ there is a unique $y \in Y_\diamond$ with $G(x,y) = (x,0)$, i.e., with $f(x,y) = 0$. For such $x$ we also have $f(x, \phi(x)) = 0$, so $y = \phi(x)$. Thus $\phi|_{X_0}$ is the unique implicit function from $X_0$ into $Y_\diamond$. Therefore defining $Y_0 \equiv Y_\diamond$ completes the proof of (e.i).

(e.ii) (Differentiability on $X_0$) Let $X_0$, $Y_0$, and $\phi: X_0 \to Y_0$ be as in part (e.i) of the Theorem, and let $\tilde x$ be any element of $X_0$. Define $\tilde y \equiv \phi(\tilde x)$. Then:

\[ f(\tilde x, \tilde y) = 0 \quad \text{(by definition of } \phi \text{ and } \tilde y\text{);} \tag{40a} \]
\[ f(x, \cdot) \text{ is continuous on } Y, \text{ for all } x \in X \quad \text{(by (4d));} \tag{40b} \]
\[ f(\cdot,\cdot) \text{ is differentiable at } (\tilde x, \tilde y) \quad \text{(by (4e));} \tag{40c} \]
\[ f_y(\tilde x, \tilde y) \text{ is surjective} \quad \text{(by (4f)).} \tag{40d} \]

Therefore the assumptions (4a, 4b, 4c, 4d) of Theorem 1 are satisfied with $(\tilde x, \tilde y)$ replacing $(\bar x, \bar y)$. So the conclusions (a), (d), and (e.i) of Theorem 1 hold with $(\tilde x, \tilde y)$ replacing $(\bar x, \bar y)$. Thus $\phi: X_0 \to Y_0$ is differentiable at $\tilde x$ and (7) holds at $x = \tilde x$. Since $\tilde x$ was an arbitrary element of $X_0$, the claim (e.ii) follows.

(f) (Classical) The uniqueness follows from part (e.ii); and the $C^r$ property follows from (7) when $f$ is $C^r$. Of course the uniqueness and $C^r$ assertions, as well as formula (7), are part of the classical Implicit Function Theorem. □

$^{7}$ There is no circularity in our reasoning here. Our proof below of parts (a) and (c) of Theorem 4 will depend only on parts (a), (b), and (c) (but not (e)) of the present theorem.

Even as simple a function as this: 

f{x,y)=y-{x)i (41) 

fails to satisfy the differentiability hypothesis of the preceding theorem. And, 
while it clearly admits the unique implicit function (/>(x) = (x) ^ , that solution 
is not differentiable at 0. Indeed, the same would be true of any function 

\[ f(x,y) = y - \psi(x), \tag{42} \]

with $\psi$ nondifferentiable. Such examples motivate our next implicit function theorem, which in the spirit of Goursat [19] has a weaker differentiability hypothesis (no differentiability with respect to $x$) but a stronger continuity hypothesis (joint in $x$ and $y$). The conclusion of our second theorem will be weaker than our first, asserting merely continuity of solutions rather than differentiability at the initial point.

Theorem 2. Continuous implicit function theorem. With $K = \mathbb{R}$ or $K = \mathbb{C}$, let $X$ be an open subset of $K^n$ and let $Y$ be an open subset of $K^k$. Suppose that $(\bar x, \bar y) \in X \times Y$ and $f: X \times Y \to K^k$. Suppose that:

\[ f(\bar x, \bar y) = 0; \tag{43a} \]
\[ f(\cdot,\cdot) \text{ is continuous on } X \times Y; \tag{43b} \]
\[ f(\bar x, \cdot) \text{ is differentiable at } \bar y; \tag{43c} \]
\[ f_y(\bar x, \bar y) \text{ is surjective; i.e., } \det \begin{pmatrix} \dfrac{\partial f^1(\bar x, \bar y)}{\partial y_1} & \cdots & \dfrac{\partial f^1(\bar x, \bar y)}{\partial y_k} \\ \vdots & & \vdots \\ \dfrac{\partial f^k(\bar x, \bar y)}{\partial y_1} & \cdots & \dfrac{\partial f^k(\bar x, \bar y)}{\partial y_k} \end{pmatrix} \ne 0. \tag{43d} \]

Then:

a) There exist open subsets $X_0 \subseteq X$ and $Y_0 \subseteq Y$ with $(\bar x, \bar y) \in X_0 \times Y_0$, and there exists a function $\phi: X_0 \to Y_0$ such that:

\[ f(x, \phi(x)) = 0 \quad \text{for all } x \in X_0 \tag{44a} \]
\[ \phi(\bar x) = \bar y. \tag{44b} \]

b) The set $X_0 \times Y_0$ in (a) can be chosen so that there exists at least one function $\phi: X_0 \to Y_0$ satisfying (44), and so that every such function is continuous at $\bar x$.

c) (Goursat.) If assumption (43c) is replaced by this stronger condition:

\[ f(x, \cdot) \text{ is } C^1 \text{ on } Y \text{ for all } x \in X, \tag{43e} \]

then there is a neighborhood $X_0 \times Y_0$ of $(\bar x, \bar y)$ and a unique function $\phi: X_0 \to Y_0$ satisfying (44a, 44b), and $\phi$ is continuous on $X_0$.

Remark 4 As W. H. Young showed [39], the existence result (a) for the case $k = 1$ only requires that $f$ be continuous separately in $x$ and $y$, rather than jointly. See the historical remarks below.

Remark 5 Theorem 2(a, b) replaces the $C^1$ hypothesis of Goursat's result [19]$^{8}$ by the weaker assumptions (43b, 43c). Its conclusions in (a) and (b) are weaker than in Goursat [19], as the implicit functions $\phi$ are not necessarily unique or continuous. Here is an example for $K = \mathbb{R}$ in which the uniqueness and continuity conclusions of part (c) cannot be obtained under the non-$C^1$ hypotheses of parts (a) and (b): Let $f(x,y) = x - g(y)$, where

\[ g(y) \equiv \begin{cases} (y/2) + y^2 \sin(1/y), & \text{if } y \ne 0 \\ 0, & \text{otherwise.} \end{cases} \tag{45} \]
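Why uniqueness fails for (45) can be seen from the sign of $g'$ near the origin. The following sketch is our own illustration: the derivative formula for $y \ne 0$ is computed by hand from (45), and the sample points $1/(2\pi n)$ and $1/((2n+1)\pi)$ are arbitrary choices that make $\cos(1/y)$ equal to $+1$ and $-1$ respectively.

```python
import math

def g(y):
    """The function (45)."""
    return y / 2.0 + y * y * math.sin(1.0 / y) if y != 0 else 0.0

def g_prime(y):
    """Derivative of g for y != 0; at y = 0 the difference quotient gives 1/2."""
    return 0.5 + 2.0 * y * math.sin(1.0 / y) - math.cos(1.0 / y)

# Slopes at points accumulating at 0 where cos(1/y) = 1: g is decreasing there.
down = [g_prime(1.0 / (2 * math.pi * n)) for n in range(1, 6)]

# Slopes at points accumulating at 0 where cos(1/y) = -1: g is increasing there.
up = [g_prime(1.0 / ((2 * n + 1) * math.pi)) for n in range(1, 6)]

print(down)
print(up)
```

Since $g$ is differentiable at 0 with $g'(0) = 1/2 \ne 0$ but changes direction arbitrarily close to 0, the equation $x = g(y)$ has several solutions $y$ for suitable small $x$.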



Proof of Theorem 2. Without loss of generality, we assume that: 

X = 0 and y = 0. (46) 

(a) (Existence) Without loss of generality, we assume that:^ 



/^(0,0)-id^.. (47) 

This is justified since /^(0, 0) is invertible (43d), so we could define the func- 
tion g{x,y) = /(x, (fy{0,0))~^y), which would have all the properties (43) 

® See the historical notes in Section 4., and the comparisons with results of 
W. H. Young and of Bliss. 

® Recall that id^fc is the identity function on K^. Cf. p. 67. 
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with ^^( 0 , 0 ) the identity on and g would admit an implicit function 0 if 
and only if / admits an implicit function 0 = (/^( 0 , 0 ))“ V- 

g{x,(j){x)) = 0 4^ f{x,^{x)) = 0. (48) 



To prove existence of an implicit function under (47), then, define F: X x 
Y by: 

F{x,y) = y- f{x,y). (49) 

By hypothesis (43c) and by (46) and (47), 



f{0,y) = f{0,0) + fy{0,0)y + R{y) 
= y + R{y) 



for some function R{y) of y that is o(||y||): 

mm 

II 2 /II o^iij/Iho 

Then: 



limy)ll 

Ill/ll 



\\y-f{o,y)\\ , 

Ill/ll ^ 
l|fi(i/)ll 

|| y || 0 #|| y ||-*0 



(49) and (50)) 

0 (by (51)), 



so there exists a 7 > 0 such that Bx^{y) Q Y and:'® 



V 7 o<^S7 Ilyll^T ll-^(0.2/)ll <7- 

We next show that: 



(50) 



( 51 ) 



(52) 



(53) 



^7 0<7^7 ^^7 0<£-,^£ ||x||^£ ^1/ ||y|[^7 1/) II ^ 7* (54) 

We begin by fixing any positive $\gamma \le \bar{\gamma}$. Then let $\bar{\varepsilon} > 0$ be small enough that
$B_{\bar{\varepsilon}}(\bar{x}) \subseteq X$, and define the correspondence
$C : \{\varepsilon \in \mathbb{R} : 0 \le \varepsilon \le \bar{\varepsilon}\} \twoheadrightarrow X \times Y$ by:

$$C(\varepsilon) = \{(x, y) : \|x\| \le \varepsilon \ \&\ \|y\| \le \gamma\}, \tag{55}$$

and the corresponding maximum function:

$$M(\varepsilon) = \max\{\|F(x, y)\| : (x, y) \in C(\varepsilon)\}. \tag{56}$$

Since $F$ is continuous by hypothesis (43b), and since $C(\cdot)$ is clearly a
continuous correspondence,^11 it follows from the Maximum Theorem^12 that $M(\cdot)$ is
a continuous function. So (54) follows from (53).

^10 Our proof of (53) is based on the differentiability hypothesis (43c). Alternatively,
we could obtain (53) by simply assuming that $F(x, \cdot)$ has a Lipschitz constant less
than 1 at $y = 0$. That property is a weakened version of the hypotheses of Goursat’s
lemma [19] (p. 185, §1, for $k = 1$, $n = 1$, and p. 188, §3, for arbitrary $k$), which
sets the stage for his application of what is now known as the Contraction Mapping
Theorem. But our continuity proof in part (b) will still use the differentiability
hypothesis (43c).

Thus for every $\gamma \le \bar{\gamma}$ and every $x$ with $\|x\| \le \bar{\varepsilon}$, the function $F(x, \cdot)$
carries $B_\gamma(0)$ into $B_\gamma(0)$. It is also continuous by (43b), so by Brouwer’s Fixed
Point Theorem there exists a $y \in B_\gamma(0)$ such that

$$F(x, y) = y, \tag{57}$$

i.e.,

$$f(x, y) = 0. \tag{58}$$

If $x \neq 0$, then we define $\phi(x) = y$, choosing any such $y$;^13 if $x = 0$ we define
$\phi(0) = 0$. Then (44) holds, with $X_0 = B_{\bar{\varepsilon}}(0)$ and $Y_0 = B_\gamma(0)$.

We have thus shown:

For every $\gamma > 0$ with $\gamma \le \bar{\gamma}$, and every $\varepsilon > 0$ with $\varepsilon \le \bar{\varepsilon}$,
there exists a function $\phi : X_0 \to Y_0$ such that (44) holds
with $X_0 = B_\varepsilon(0)$ and $Y_0 = B_\gamma(0)$. (59)

(b) (Continuity) We now prove that the set $X_0 \times Y_0$ containing $(\bar{x}, \bar{y})$ in
part (a) of Theorem 2 can be chosen so that not only does there exist a function
$\phi: X_0 \to Y_0$ satisfying (44), but every such function is continuous at $x = 0$.
We can choose $\delta > 0$ with $\delta \le \bar{\gamma}$ such that:

$$\|R(y)\| < \|y\| \quad \text{for all nonzero } y \in B_\delta(0) \quad \text{(using (51))}, \tag{60a}$$

and by (50) we can also ensure that:

$$y \neq 0 \implies f(0, y) \neq 0 \quad \text{for all } y \in B_\delta(0). \tag{60b}$$

Defining $\gamma = \delta$, $X_0 = B_{\bar{\varepsilon}}(0)$ and $Y_0 = B_\gamma(0)$, it follows from (59) that there
exists a function $\phi: X_0 \to Y_0$ satisfying (44).

Now let $\phi$ be any function from $X_0$ into $Y_0$ satisfying (44). Suppose that
$\phi$ is not continuous at 0. Then there is a sequence $x_1, x_2, \ldots$ converging to 0,
but with $\phi(x_i)$ not converging to $\phi(0) = 0$. So there is a subsequence, whose
elements we again denote by $x_j$, for which

$$\phi(x_j) \to \tilde{y} \neq 0. \tag{61}$$

By (60b) that implies:

$$f(0, \tilde{y}) \neq 0, \tag{62}$$

so by the continuity hypothesis (43b) we have $f(x, y) \neq 0$ for all $(x, y)$ in some
neighborhood of $(0, \tilde{y})$. But defining $y_j = \phi(x_j)$ we also have $f(x_j, y_j) = 0$
by the solution property (44a), and $(x_j, y_j) \to (0, \tilde{y})$ by (61). This
contradiction completes the proof of continuity.

^11 I.e., both upper and lower hemicontinuous. Cf. Berge [1], [2], where “semicontinuity”
is used for what we are calling “hemicontinuity.”

^12 Cf. [12], p. 889, Remark, [13], p. 19, 1.8(4), [1], p. 122, [2], p. 116.

^13 As in footnote 5, the Axiom of Choice is not needed.

(c) (Uniqueness and continuity) If $f(x, \cdot)$ is $C^1$ on $Y$ for all $x \in X$, then
the local uniqueness and continuity of the implicit function are part of
Goursat’s result [19]. □

Remark 6 While some of the hypotheses of the Continuous and the Differentiable
Implicit Function Theorems are the same, in total they are noncomparable.
Parts (a) and (b) of Theorem 1 have stronger differentiability hypotheses
than parts (a) and (b) of Theorem 2, but weaker continuity properties. While
the function $f$ in (41) satisfies the hypotheses of Theorem 2(a,b) but not those
of Theorem 1(a,b), the reverse is true for the function

$$f(x, y) = \begin{cases} y + x^2 + y^2, & \text{if } x \text{ is rational} \\ y, & \text{otherwise.} \end{cases} \tag{63}$$

Here $f(x, y)$ is not continuous in $x$ at $(0, y)$ for $y \neq 0$, so it violates (43b) for
$(\bar{x}, \bar{y}) = (0, 0)$; but it is differentiable at $(0, 0)$, and satisfies the other
hypotheses of Theorem 1(a,b).

We have obtained implicit functions under two different hypotheses about
the independent variable $x$. In Theorem 1(a,b), the function $f(x, y)$ was
differentiable jointly in $x$ and $y$ at $(\bar{x}, \bar{y})$; and we concluded that the implicit
function $\phi(x)$ was differentiable at $\bar{x}$. In Theorem 2(a,b), the function $f(x, y)$ was
only differentiable with respect to $y$ at $(\bar{x}, \bar{y})$, and $f(x, y)$ was continuous with
respect to $x$ at $\bar{x}$; and we concluded that the implicit function $\phi(x)$ was
continuous at $\bar{x}$. Now we show that these two results can be combined, allowing two
types of independent variables: those satisfying differentiability hypotheses
yield differentiability conclusions for the implicit function, and those satisfying
continuity hypotheses yield continuity conclusions. We are interested in solving
$f(v, w, y) = 0$ for $y$ as a function $\phi$ of $(v, w)$: $f(v, w, \phi(v, w)) = 0$. The role
played by $x$ in Theorem 1 is played here by $v$, and the role of $x$ in Theorem 2
is played here by $w$.

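Theorem 3. General implicit function theorem. With $K = \mathbb{R}$ or $K = \mathbb{C}$, let
$V$, $W$, and $Y$ be subsets of $K^m$, $K^p$, and $K^k$ respectively, where $m + p > 0$;
we allow $m = 0$ or $p = 0$, in which case $V$ or $W$, respectively, is empty.
Suppose that $(\bar{v}, \bar{w}, \bar{y}) \in V \times W \times Y$, that $\bar{v}$ is a limit point of $V$, and that $W$
and $Y$ are open. Let $f: V \times W \times Y \to K^k$. Suppose that:

$$f(\bar{v}, \bar{w}, \bar{y}) = 0; \tag{64a}$$

$$f(v, \cdot, \cdot) \text{ is continuous on } W \times Y, \text{ for all } v \in V; \tag{64b}$$

$$f(\cdot, \bar{w}, \cdot) \text{ is differentiable at } (\bar{v}, \bar{y}); \tag{64c}$$

$$f_y(\bar{v}, \bar{w}, \bar{y}) \text{ is surjective; i.e., } \operatorname{rank} \begin{pmatrix} \dfrac{\partial f^1(\bar{v}, \bar{w}, \bar{y})}{\partial y_1} & \cdots & \dfrac{\partial f^1(\bar{v}, \bar{w}, \bar{y})}{\partial y_k} \\ \vdots & & \vdots \\ \dfrac{\partial f^k(\bar{v}, \bar{w}, \bar{y})}{\partial y_1} & \cdots & \dfrac{\partial f^k(\bar{v}, \bar{w}, \bar{y})}{\partial y_k} \end{pmatrix} = k. \tag{64d}$$

For any $\xi, \eta > 0$, let $V_\xi = B_\xi(\bar{v}) \cap V$ and $Y_\eta = \operatorname{Int}(B_\eta(\bar{y}))$. Then:

a) There exist real $\xi, \eta > 0$ with $Y_\eta \subseteq Y$ such that, for every $v \in V_\xi$
there exists an open neighborhood $W_v$ of $\bar{w}$ in $W$, and a function
$\phi(v, \cdot) : W_v \to Y_\eta$ such that: for all $w \in W_v$,

$$f(v, w, \phi(v, w)) = 0 \tag{65a}$$

$$\phi(\bar{v}, \bar{w}) = \bar{y}. \tag{65b}$$

b) The $\xi$ and $\eta$ in part (a) can be chosen so that every function $\phi$ satisfying
part (a) also satisfies:

$$\phi(\bar{v}, \cdot) \text{ is continuous at } \bar{w} \tag{66a}$$

$$\phi(\cdot, \bar{w}) \text{ is differentiable at } \bar{v}. \tag{66b}$$

c) For every $\xi, \eta > 0$, all functions $\phi(\cdot, \bar{w}): V_\xi \to Y_\eta$ satisfying (65)
that are differentiable at $\bar{v}$ have derivatives $\phi_v(\bar{v}, \bar{w})$, determined by the
derivatives $f_v(\bar{v}, \bar{w}, \bar{y})$ as follows:

i) for any value of $f_v(\bar{v}, \bar{w}, \bar{y})$, a value of $\phi_v(\bar{v}, \bar{w})$ is:

$$\phi_v(\bar{v}, \bar{w}) = -(f_y(\bar{v}, \bar{w}, \bar{y}))^{-1} f_v(\bar{v}, \bar{w}, \bar{y}); \tag{67a}$$

ii) for any value of $\phi_v(\bar{v}, \bar{w})$, a value of $f_v(\bar{v}, \bar{w}, \bar{y})$ is:

$$f_v(\bar{v}, \bar{w}, \bar{y}) = -f_y(\bar{v}, \bar{w}, \bar{y})\, \phi_v(\bar{v}, \bar{w}). \tag{67b}$$

d) If $f(\cdot, \cdot, \cdot)$ is continuous on $V \times W \times Y$, then there is a neighborhood
$V_0 \times W_0 \times Y_0$ of $(\bar{v}, \bar{w}, \bar{y})$ and a function $\phi$ satisfying (65) that is continuous at
$(\bar{v}, \bar{w})$.

e) (Goursat [19], [20].) If $f(\cdot, \cdot, \cdot)$ is continuous on $V \times W \times Y$, and if
$f(v, w, \cdot)$ is $C^1$ on $Y$ for all $(v, w) \in V \times W$, then there is a neighborhood
$V_1 \times W_1 \times Y_1$ of $(\bar{v}, \bar{w}, \bar{y})$ and a unique function $\phi: V_1 \times W_1 \to Y_1$
satisfying (65a, 65b), and $\phi$ is continuous on $V_1 \times W_1$.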
Remark 7 This theorem is a generalization of the Differentiable and the Continuous
Implicit Function Theorems: when $w$ is absent, the hypotheses and
conclusions are those of the Differentiable Implicit Function Theorem, and
when $v$ is absent, they are those of the Continuous Implicit Function Theorem.

Proof of Theorem 3. Without loss of generality, we simplify by assuming that:

$$\bar{v} = 0 \ \text{ and } \ \bar{w} = 0 \ \text{ and } \ \bar{y} = 0. \tag{68}$$

Also without loss of generality, we assume that:

$$f_y(0, 0, 0) = \mathrm{id}_{K^k}. \tag{69}$$

This is justified since $f_y(0, 0, 0)$ is invertible (64d), so we could define the
function $g(v, w, y) = f(v, w, (f_y(0, 0, 0))^{-1} y)$, which would have all the
properties (64) with $g_y(0, 0, 0)$ the identity on $K^k$, and $g$ would admit an implicit
function $\psi$ if and only if $f$ admits an implicit function $\phi = (f_y(0, 0, 0))^{-1} \psi$:

$$g(v, w, \psi(v, w)) = 0 \iff f(v, w, \phi(v, w)) = 0. \tag{70}$$

(a) (Existence) Define $F : V \times W \times Y \to K^k$ by:

$$F(v, w, y) = y - f(v, w, y). \tag{71}$$

By hypothesis (64c),

$$\begin{aligned}
f(v, 0, y) &= f(0, 0, 0) + f_v(0, 0, 0)\, v + f_y(0, 0, 0)\, y + R(v, y) \\
&= y + f_v(0, 0, 0)\, v + R(v, y) \qquad \text{(by (69), (64a) and (68))}
\end{aligned} \tag{72}$$

for some function $R(v, y)$ of $(v, y)$ that is $o(\|(v, y)\|)$:

$$\lim_{0 \neq \|(v, y)\| \to 0} \frac{R(v, y)}{\|(v, y)\|} = 0. \tag{73}$$

By (71) and (72) we have:

$$\begin{aligned}
\|F(v, 0, y)\| &= \|y - f(v, 0, y)\| \\
&= \|R(v, y) + f_v(0, 0, 0)\, v\| \\
&\le \|R(v, y)\| + \|f_v(0, 0, 0)\|\, \|v\|,
\end{aligned} \tag{74}$$

so by (73) there exists a $\gamma > 0$ such that:

$$\forall (v, y),\ \|(v, y)\| \le \gamma: \quad \|F(v, 0, y)\| < \tfrac{1}{2}\gamma + \|f_v(0, 0, 0)\|\, \|v\|. \tag{75}$$
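Now define:

$$\varepsilon_0 = \begin{cases} \dfrac{\gamma}{2\, \|f_v(0, 0, 0)\|}, & \text{if } \|f_v(0, 0, 0)\| \neq 0 \\[1ex] \gamma, & \text{otherwise,} \end{cases} \tag{76}$$

so $\varepsilon_0 > 0$. Now let

$$\varepsilon = \min\{\varepsilon_0, \gamma\}. \tag{77}$$

Then:

$$\forall v,\ \|v\| < \varepsilon \quad \forall y,\ \|(v, y)\| \le \gamma: \quad \|F(v, 0, y)\| < \tfrac{1}{2}\gamma + \tfrac{1}{2}\gamma = \gamma. \tag{78}$$

Without loss of generality, we may suppose that the norm on $V \times Y$ is the
maximum norm, and then (78) and (77) imply:

$$\forall v,\ \|v\| < \varepsilon \quad \forall y,\ \|y\| \le \gamma: \quad \|F(v, 0, y)\| < \gamma. \tag{79}$$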

Next we verify that this inequality holds not only for $w = 0$, but for all $w$
near 0. Since $F(v, 0, y)$ is defined for $\|v\| < \varepsilon$ & $\|y\| \le \gamma$, openness of $W$
ensures there is some $\bar{\delta} > 0$ for which $F(v, w, y)$ is defined for all $\|v\| < \varepsilon$ &
$\|w\| \le \bar{\delta}$ & $\|y\| \le \gamma$. Then paralleling the proof of Theorem 2,^14 we define:

$$C(\delta) = \{(w, y) : \|w\| \le \delta \ \&\ \|y\| \le \gamma\}, \tag{80}$$

for all $\delta$ with $0 \le \delta \le \bar{\delta}$. Clearly $C(\cdot)$ is a continuous correspondence, so by
the continuity hypothesis (64b) and the Maximum Theorem, for every $v$ with
$\|v\| < \varepsilon$, the function

$$M_v(\delta) = \max\{\|F(v, w, y)\| : \|w\| \le \delta \ \&\ \|y\| \le \gamma\} \tag{81}$$

is continuous for $0 \le \delta \le \bar{\delta}$. So by (79) for every $v$ with $\|v\| < \varepsilon$, there exists
a $\delta_v > 0$ with $\delta_v \le \bar{\delta}$ such that:

$$\forall w,\ \|w\| \le \delta_v \quad \forall y,\ \|y\| \le \gamma: \quad \|F(v, w, y)\| \le \gamma. \tag{82}$$

Thus for every $v$ with $\|v\| < \varepsilon$ and every $w$ with $\|w\| \le \delta_v$, the function
$F(v, w, \cdot)$ carries $B_\gamma(0)$ into $B_\gamma(0)$. It is also continuous by (64b), so by
Brouwer’s Fixed Point Theorem there exists a $y \in B_\gamma(0)$ such that

$$F(v, w, y) = y, \tag{83}$$

i.e.,

$$f(v, w, y) = 0. \tag{84}$$

^14 Cf. (54), (55), and (56).
If $(v, w) \neq (0, 0)$, then we define $\phi(v, w) = y$, for any such $y$;^15 if $w = 0$ and $v = 0$,
we define $\phi(0, 0) = 0$. Then (65) holds with $V_\xi = B_\varepsilon(0)$, $W_v = B_{\delta_v}(0)$ and
$Y_\eta = B_\gamma(0)$.

(b) (Continuity with respect to $w$) To prove part (66a), we note that the
function $\tilde{f}$ defined by $\tilde{f}(w, y) = f(0, w, y)$ satisfies the hypotheses of the
Continuous Implicit Function Theorem, and so by part (b) of that theorem we
immediately obtain property (66a).

(Differentiability with respect to $v$.) To prove (66b), we note that the
function $\hat{f}$ defined by $\hat{f}(v, y) = f(v, 0, y)$ satisfies the hypotheses of the
Differentiable Implicit Function Theorem, and so by part (b) of that theorem we
immediately obtain property (66b).

(c) Cf. Theorem 1(c).

(d) (Continuity with respect to $(v, w)$) Part (d) follows immediately from
part (b) of the Continuous Implicit Function Theorem, since if we define
$x = (v, w)$, then all the hypotheses of that theorem apply.

(e) (Uniqueness and local continuity) See part (c) of Theorem 2. □



3. A general inverse function theorem 

As usual, one can derive an inverse function theorem from an implicit function 
theorem. 

Theorem 4. General inverse function theorem. Let $K = \mathbb{R}$ or $K = \mathbb{C}$, and
let $X$ be an open subset of $K^n$, and let $g: X \to K^n$ be continuous on $X$
and differentiable at $\bar{x} \in X$. Let $\bar{y} = g(\bar{x})$. Suppose $g'(\bar{x})$ is surjective, i.e.,
$\operatorname{rank}(g'(\bar{x})) = n$. Then:

a) There exists an open neighborhood $Y$ of $\bar{y}$ and a function $h: Y \to X$
such that:

$$g \circ h = \mathrm{id}_Y \tag{85a}$$

$$h(\bar{y}) = \bar{x}. \tag{85b}$$

b) For every open neighborhood $Y$ of $\bar{y}$, every function $h$ satisfying (85) is
injective, and is differentiable at $g(\bar{x})$ with:

$$h'(g(\bar{x})) = (g'(\bar{x}))^{-1}. \tag{86}$$

c) If the differentiability of $g$ at $\bar{x}$ is strengthened to differentiability on $X$,
and surjectivity of $g'(\bar{x})$ is strengthened to surjectivity of $g'$ on $X$, then:

i) For every open neighborhood $Y$ of $\bar{y}$, every function $h: Y \to X$
satisfying (85) is differentiable on $Y$ and $h'$ is surjective on $Y$.

ii) An open neighborhood $X_0$ of $\bar{x}$ can be chosen so that $g|_{X_0}$ is a
diffeomorphism from $X_0$ to the open neighborhood $g(X_0)$ of $\bar{y}$; in
particular, $g|_{X_0}$ has a unique inverse $h$ on $g(X_0)$, and:

$$h'(g(x)) = (g'(x))^{-1} \quad \text{for all } x \in X_0. \tag{87}$$

d) (Classical) If $g$ is $C^r$ on $X$ for some $r \ge 1$, then there is an open
neighborhood $X_0$ of $\bar{x}$ such that $g|_{X_0}$ is a $C^r$-diffeomorphism from $X_0$ to the
open set $g(X_0)$.

^15 As in footnote 5, the Axiom of Choice is not needed.

Proof. We will prove the General Inverse Function Theorem from the Differentiable
Implicit Function Theorem, reversing the roles of $x$ and $y$. Define

$$f(x, y) = g(x) - y \tag{88}$$

for all $x \in X$ and $y \in K^n$, so $f$ is differentiable at $(\bar{x}, \bar{y})$.

(a) (Existence) To prove part (a), note that because $f_x(\bar{x}, \bar{y}) = g'(\bar{x})$ is
surjective, Theorem 1(a) implies that there exists an open neighborhood $Y$ of
$\bar{y} = g(\bar{x})$ and a function $\psi: Y \to X$ such that:

$$f(\psi(y), y) = 0 \quad \text{for all } y \in Y \tag{89a}$$

$$\psi(\bar{y}) = \bar{x}. \tag{89b}$$

So in view of (88), $h = \psi$ satisfies (85); thus $h$ is a right inverse on $Y$ of $g$.

(b) (Injectivity) Let $Y$ and $h$ be as in part (a). Then $h$ is injective: if $h(y_1) = h(y_2)$
then $y_1 = g(h(y_1)) = g(h(y_2)) = y_2$ by (85a).

(Differentiability at $g(\bar{x})$.) Theorem 1(b) implies that any $\psi$ satisfying
(89) is differentiable at $\bar{y} = g(\bar{x})$. Then by (85) the Chain Rule implies that
$g'(h(\bar{y}))\, h'(\bar{y})$ is the identity on $K^n$, so $h = \psi$ also satisfies (86).

(c.i) (Differentiability on $Y$) Assume now that $g'(x)$ exists and is surjective
for all $x \in X$, let $Y$ be an open neighborhood of $\bar{y}$, and let $h: Y \to X$
satisfy (85) as guaranteed by Theorem 4(a). For any $y \in Y$ define $x = h(y) \in X$,
so $y = g(x)$. Then by Theorem 4(b) $h$ is differentiable at $y$, and $h'(y) = (g'(x))^{-1}$.

(c.ii) (Diffeomorphism) By Theorem 4(a) there exists an open neighborhood
$Y$ of $\bar{y}$ and a function $h: Y \to X$ satisfying (85). Then $h$ is injective on
$Y$ (by (b)) and continuous on $Y$ (by (c.i) since $g$ is differentiable on $X$). So it
follows from Brouwer’s theorem on Invariance of Domain that $h$ is a
homeomorphism from $Y$ to the open neighborhood $h(Y)$ of $\bar{x}$. Thus $X_0 = h(Y)$ is
an open neighborhood of $\bar{x}$, and $g|_{X_0}: X_0 \to g(X_0) = g(h(Y)) = Y$, as a
left inverse of $h$, is also a homeomorphism; so $g|_{X_0}$ and $h$ are inverse
homeomorphisms between $X_0$ and $Y$. By (c.i) the inverse $h$ is differentiable, so $g|_{X_0}$
is a diffeomorphism and (87) holds.

(d) (Classical) The uniqueness follows from part (c.ii); and the $C^r$ property
follows from (87) when the homeomorphism is $C^r$. Of course the uniqueness
and $C^r$ assertions, as well as formula (87), are contained in the classical
Inverse Function Theorem. □

Diffeomorphism Corollary. Let $K = \mathbb{R}$ or $K = \mathbb{C}$, let $X$ be an open subset
of $K^n$, let $g: X \to K^n$, and let the integer $r \ge 1$. Then $g$ is a local
diffeomorphism^16 at $\bar{x} \in X$ if and only if there is some neighborhood of $\bar{x}$ on
which $g$ is $r$ times differentiable and $g'$ is surjective.

Proof. The necessity of local differentiability and surjectivity is obvious.
The sufficiency is contained in part (c.ii) of the General Inverse Function
Theorem when $r = 1$. For $r > 1$ the proof follows by induction, paralleling the
usual proofs of the classical Inverse Function Theorem. □

Remark 8

a) There are examples^18 in which no inverse $h$ in part (a) of the General
Inverse Function Theorem is unique or continuous. In such examples, with $f$ as
defined by (88), none of the functions $\phi$ in Theorems 1(a,b), 2(a,b), and 3(a,b)
can be unique or continuous.

b) The continuity hypothesis on $g$ cannot be dropped altogether, as shown
by instances based on example (8) above.

c) The uniqueness and differentiability properties that follow from the
additional hypotheses in part (c) are new results in the real case $K = \mathbb{R}$. When
$K = \mathbb{C}$, however, the local differentiability assumption in (c) is only
apparently weaker than the classical assumptions for $K = \mathbb{R}$, since for $K = \mathbb{C}$ the
local differentiability hypothesis of (c)^19 implies, by Goursat’s Theorem [11],
p. 100 on analytic functions, that $g$ is actually analytic, in which case the
classical hypotheses and conclusions of part (d) hold.

d) As an example of a diffeomorphism that is not $C^1$, yet satisfies our
hypotheses in part (c.ii) of Theorem 4 and the Corollary, consider:

$$g(x) = \begin{cases} 2x + x^2 \sin(1/x), & \text{for } -1 < x < 1,\ x \neq 0 \\ 0, & \text{for } x = 0. \end{cases} \tag{90}$$

e) The theorems have useful applications, since the classical Implicit Function
Theorem is imbedded in many parts of mathematics, including differentiable
manifolds and optimization theory. The weaker hypotheses of the theorems
above admit the possibility of extending existing results to these and other
areas.

^16 Cf. page 67.

^17 Cf. [14], p. 272, (10.2.5).

^18 E.g. (45) or [14], p. 273, Problem 2.

^19 But not those of parts (a) and (b).
For example, replacing the classical Implicit Function Theorem by the general
version (Theorem 1) allows a proof of the Lagrange Multiplier Theorem
that weakens the classical $C^1$ hypothesis to just continuity of the constraint
functions and their differentiability at the extremum [26]. As another example,
comparative statics questions for equilibrium systems^20 can now be addressed
with weaker assumptions.



4. Some historical background and comparisons 

We originally developed the general theorems to solve an applied problem [26]. 
However, attempts to ascertain their novelty led to a historical study, and we 
present a few of the highlights that may be of interest to some readers. 

We sketch the development of the implicit function theorem, with pri- 
mary emphasis on the smoothness assumptions underlying the proofs. For the 
early period we have been guided by the information in Osgood [33], with 
Cauchy the earliest mentioned. We have not tried to examine the history prior 
to Cauchy’s work. We use the notation of the previous sections. 

There are several dimensions we could use for making historical comparisons.
We could discuss assumptions: real scalars or complex scalars; the number
of $x$ and $y$ variables; the smoothness assumptions on $f$ with respect to $x$
or $y$ or $(x, y)$; smoothness assumptions at $(x, y)$ or in a neighborhood of the
point. We could discuss conclusions: existence of implicit functions, uniqueness
of implicit functions, and smoothness of implicit functions. We could discuss
methods of proof: Cauchy’s calculus of residues, Cauchy’s Calcul des limites,
differential equations, or fixed point theorems.

We will give a brief chronological history, highlighting major differences 
among the various contributions, comparing our three theorems above with the 
major earlier results. 

1831. Cauchy. Cauchy’s results [10] are presumably the earliest rigorous
existence proofs of the Implicit Function Theorem. Here the underlying scalar
field $K$ is that of the complex numbers. He mentions the general case in the
introduction, and gives a detailed analysis of the special case $k = 1$, $n = 1$. The
function $f$ is assumed to be represented by a power series (i.e., to be analytic).
A unique implicit function $\phi$ is obtained, and it is shown to be analytic.

Cauchy’s tools were his Calculus of Residues and his Calculus of Limits.
Their flavor may be experienced by considering the simple case where $k = 1$
and $n = 1$, so that both $x$ and $y$ are one-dimensional. Then a presentation of
Cauchy’s approach, in modern language, looks roughly like the following.



^20 Cf. [37], Part I.

Assuming $f$ is analytic and not identically zero, for any given $x$, the
generalized Argument Principle implies:^21

$$\frac{1}{2\pi i} \int_C y\, \frac{f_y(x, y)}{f(x, y)}\, dy = \sum_{r} \alpha_r a_r, \tag{91}$$

where the $a_r$ are zeros of $f(x, \cdot)$ and the $\alpha_r$ are their multiplicities, and where
$C$ is a closed rectifiable curve not passing through any root $a_r$. Since the roots
of an analytic function that is not identically zero are isolated, one can pick a
small enough circle $C$ in the complex plane about the root $a_1$ of $f(x, \cdot)$ so that
there are no other zeros in the circle. In particular, Cauchy considers the case
that the root $a_1$ has multiplicity 1.^22 Then denoting $a_1$ by $\phi(x)$, (91) becomes:^23

$$\frac{1}{2\pi i} \int_C y\, \frac{f_y(x, y)}{f(x, y)}\, dy = \phi(x). \tag{92}$$

In view of the continuity of $f$ and its derivatives, Rouché’s Theorem^24 implies
that for all nearby $x$ the number of zeros (counting multiplicities) of the
function $f(x, \cdot)$ remains the same; so there is a unique local solution function $\phi$,
and it can be calculated by the integral formula (92).
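The integral formula (92) can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the polynomial $f(x, y) = y^2 + y - x$, the circle radius 0.5, and the function names are hypothetical choices, not taken from Cauchy or from the text above. It approximates the contour integral by the trapezoidal rule on a circle about $y = 0$ and compares the result with the root of $f(x, \cdot)$ that passes through 0 at $x = 0$.

```python
import cmath
import math


def f(x: complex, y: complex) -> complex:
    """Hypothetical analytic example: f(x, y) = y^2 + y - x, so f(0, 0) = 0."""
    return y * y + y - x


def f_y(x: complex, y: complex) -> complex:
    """Partial derivative of the example f with respect to y."""
    return 2.0 * y + 1.0


def phi_by_contour(x: complex, radius: float = 0.5, n: int = 256) -> complex:
    """Approximate (1/(2 pi i)) * integral over |y| = radius of y * f_y/f dy.

    On the circle y = radius * exp(i*theta) we have dy = i*y*dtheta, so the
    trapezoidal rule reduces to the mean of y * (y * f_y / f) over n nodes.
    """
    total = 0.0 + 0.0j
    for j in range(n):
        y = radius * cmath.exp(2j * math.pi * j / n)
        total += y * (y * f_y(x, y) / f(x, y))
    return total / n


def phi_exact(x: complex) -> complex:
    """The root of y^2 + y - x = 0 that equals 0 at x = 0."""
    return (-1.0 + cmath.sqrt(1.0 + 4.0 * x)) / 2.0


if __name__ == "__main__":
    x = 0.1
    print(phi_by_contour(x), phi_exact(x))
```

For small $|x|$ the second root of this example lies near $-1$, outside the circle, so the circle encloses exactly one simple zero, as the derivation of (92) requires.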

Cauchy further verifies that $\phi$ is (locally) an analytic function, that is, it is
represented by its Maclaurin series. First he applies the calculus of residues
to calculate derivatives for the Maclaurin series. Without loss of generality
assuming $\bar{x} = 0$ and $\phi(\bar{x}) = 0$:^25

$$\begin{aligned}
\phi^{(m)}(0) &= \frac{1}{2\pi i} \int_C y A_m(y)\, dy && \text{(defining } A_m\text{)} \\
&= \frac{1}{2\pi i} \bigl( 2\pi i \operatorname{Res}(y A_m(y);\, 0) \bigr) && \text{(by the Residue Theorem)} \\
&= \frac{1}{(m-1)!} \lim_{y \to 0} \frac{d^{m-1}}{dy^{m-1}} \bigl( y^m \cdot y A_m(y) \bigr),
\end{aligned} \tag{93}$$

where the last equality follows^26 since $y A_m(y)$ has a pole of order $m$. Then he
shows that the series converges to $\phi$.

^21 Cf. [11], p. 124, Theorem 3.6.

^22 When the usual assumption is made that $f_y(x, y)$ is surjective, it follows that the
unique root within the circle is a simple root.

^23 Cf. [10], p. 76 (52). Cauchy expresses $y$ in polar coordinates, so his integrand has
an additional factor $y$.

^24 Cf. [11], p. 125, Theorem 3.8.

^25 Cf. [10], p. 83, equation (97).

^26 Cf. [11], p. 113, Proposition 2.4.
Of course this is an anachronistic account of his methods: the generalized
Argument Principle, Rouche’s Theorem, etc., had perhaps not yet crystallized 
as named theorems, and Cauchy established them in the context of his proof. 



1852. Cauchy. In his memoir [9], Cauchy used an alternative approach for
proving the Implicit Function Theorem, namely existence theorems for
differential equations. Again the underlying scalar field is that of the complex
numbers, and the function $f$ is analytic (“synectic”). The implicit function $\phi$ is
unique and analytic. It is explicitly assumed that $f_y(x, y)$ is nonsingular.

Although he mentions more general cases, formal proofs are given only
for the case where $x$ is one-dimensional but $y$ can be multi-dimensional (i.e.,
$n = 1$ and $k \ge 1$). In this situation Cauchy is able to apply existence
theorems for ordinary differential equations to establish existence, uniqueness, and
differentiability of the implicit function $\phi$.

We can illustrate the logic of his approach again for the simple case
$f(x, y) = 0$ with both $x$ and $y$ one-dimensional. Assume^28 that $f$ is $C^1$ and
that $f_y(x, y) \neq 0$. In that case, the right hand side of the ordinary differential
equation

$$\frac{dy}{dx} = -\frac{f_x(x, y)}{f_y(x, y)} \tag{94}$$

with the initial condition $f(\bar{x}, \bar{y}) = 0$ is well defined and has a unique local
solution, say $y = \phi(x)$, satisfying $\bar{y} = \phi(\bar{x})$. It can be shown that the function
$\phi$ is the unique function locally satisfying $f(x, \phi(x)) = 0$ with $\bar{y} = \phi(\bar{x})$; i.e.,
it is the implicit function for the given function $f$. Furthermore, its derivative
satisfies the relation

$$\phi'(x) = -\frac{f_x(x, \phi(x))}{f_y(x, \phi(x))}. \tag{95}$$

Thus we have obtained the standard Implicit Function Theorem (for this simple
case), but at the cost again of assuming the given function $f$ to be $C^1$.
Hence it could have been used to obtain Dini’s result for a system involving
a single independent variable. Perhaps an analogous approach using existence
theorems for partial differential equations could be used when there are several
independent variables, i.e., when $n > 1$.
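The differential-equation route of (94) and (95) can be sketched numerically. The Python example below is a minimal illustration under assumptions that are not in the original: the function $f(x, y) = x^2 + y^2 - 1$, the base point $(\bar{x}, \bar{y}) = (0, 1)$, and the use of a classical fourth-order Runge-Kutta step are all hypothetical choices. It integrates $dy/dx = -f_x/f_y$ from the base point and compares the result with the known implicit function $\phi(x) = \sqrt{1 - x^2}$.

```python
import math


def f(x: float, y: float) -> float:
    """Hypothetical example: the unit circle, with f(0, 1) = 0."""
    return x * x + y * y - 1.0


def f_x(x: float, y: float) -> float:
    return 2.0 * x


def f_y(x: float, y: float) -> float:
    return 2.0 * y


def rhs(x: float, y: float) -> float:
    """Right-hand side of (94): dy/dx = -f_x / f_y (requires f_y != 0)."""
    return -f_x(x, y) / f_y(x, y)


def implicit_by_ode(x_end: float, x0: float = 0.0, y0: float = 1.0,
                    steps: int = 200) -> float:
    """Integrate (94) from (x0, y0) to x_end with classical RK4 steps."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = rhs(x, y)
        k2 = rhs(x + h / 2.0, y + h * k1 / 2.0)
        k3 = rhs(x + h / 2.0, y + h * k2 / 2.0)
        k4 = rhs(x + h, y + h * k3)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        x += h
    return y


if __name__ == "__main__":
    y = implicit_by_ode(0.3)
    print(y, math.sqrt(1.0 - 0.3 ** 2), f(0.3, y))
```

The sketch only works while $f_y$ stays away from zero along the solution, which mirrors the nonsingularity assumption in the text.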

Theorem 1 above weakens the assumptions of both Cauchy papers by
reducing the smoothness requirements on $f$ (from analytic in a neighborhood to
differentiable at a point). Although Theorem 1 obtains existence of an implicit
function and asserts that all implicit functions are differentiable, it cannot claim
uniqueness of the implicit function.

^27 A lucid exposition of the methods he used in [10] for establishing (92) is contained
in [30].

^28 This logic applies both to the real scalars and also to the complex scalars (in which
case $C^1$ is equivalent to analytic).
1875. Briot and Bouquet.^29 Briot and Bouquet in their [5] are clearly
aware of and following Cauchy’s work in the theory of complex variables. The
scalars are complex, and $f$ is analytic. In Section 211, p. 336, Theorem III, they
state and prove the theorem for the special case $k = 1$, $n = 1$; in Section 212,
p. 337, Theorem IV, they state and sketch an argument for the more general
case $k \ge 1$, but still with $n = 1$.

They use solution methods for ordinary differential equations to establish
existence of an analytic implicit function. The uniqueness of the implicit
function is implied by their results on ordinary differential equations (an explicit
uniqueness argument is given in Section 209, p. 332, for the case of a single
ordinary differential equation).

1877-78. Dini. In his lecture notes [15], Dini states the Implicit Function
Theorem in the form found in most textbooks today. The scalars are real, and
$f$ is $C^1$ (or $C^r$ for some $r \ge 1$). He solves for a unique implicit function $\phi$,
which he proves is $C^1$ (respectively, $C^r$).

At that time, of course, Brouwer’s Fixed Point Theorem [7] was not
available.^30 Instead, Dini used the Intermediate Value Theorem together with a
$C^1$ assumption. To indicate the essence of his approach, consider the simple case
where both $x$ and $y$ are one-dimensional and there is only a single given
function $f$ and a single equation $f(x, y) = 0$.

In this context, Dini assumes that, in some neighborhood of $(\bar{x}, \bar{y})$, the
function $f$ is $C^1$ and furthermore, that the partial derivative $f_y(\bar{x}, \bar{y}) \neq 0$.
His proof of the existence of an implicit function $y = \phi(x)$ then uses two
propositions of differential calculus: the Intermediate Value Theorem and
Taylor’s Formula with Remainder (Extended Theorem of the Mean) in the form
$f(x + h, y + k) = f(x, y) + h f_x(x + \theta h, y + \theta k) + k f_y(x + \theta h, y + \theta k)$, with
$0 < \theta < 1$.

Thus even to establish existence, this proof uses the existence of the
derivatives of $f$ with respect to $x$ and $y$ in a neighborhood of $(\bar{x}, \bar{y})$. Furthermore it
uses the continuity of the partial $f_y$ to show that, in a sufficiently small
neighborhood, it retains the same sign as at $(\bar{x}, \bar{y})$, and the continuity of $f_x$ is also
used.
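The role of the Intermediate Value Theorem in this style of argument can be illustrated by bisection. The Python sketch below is not Dini's proof; it assumes a hypothetical example $f(x, y) = x^2 + y^2 - 1$ near $(\bar{x}, \bar{y}) = (0, 1)$, where $f_y > 0$, and an assumed bracket $[0.5, 1.5]$ for $y$. For each fixed $x$ near 0, $f(x, \cdot)$ changes sign on the bracket, so a zero exists there, and monotonicity in $y$ makes it the only one.

```python
def f(x: float, y: float) -> float:
    """Hypothetical example with f(0, 1) = 0 and f_y(0, 1) = 2 > 0."""
    return x * x + y * y - 1.0


def implicit_by_bisection(x: float, y_lo: float = 0.5, y_hi: float = 1.5,
                          tol: float = 1e-12) -> float:
    """Locate the zero of f(x, .) on [y_lo, y_hi] by bisection.

    Requires f(x, y_lo) < 0 < f(x, y_hi); this sign change is what the
    Intermediate Value Theorem needs to guarantee a zero in the bracket.
    """
    lo, hi = y_lo, y_hi
    if not (f(x, lo) < 0.0 < f(x, hi)):
        raise ValueError("f(x, .) does not change sign on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(x, mid) < 0.0:
            lo = mid   # the zero lies above mid
        else:
            hi = mid   # the zero lies at or below mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(implicit_by_bisection(0.3))
```

When $x$ moves too far from $\bar{x}$ (for instance $x = 2$ in this example), the sign change is lost and the function raises an error, which reflects the local nature of the result.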

The first printed version of Dini’s implicit function theorem is in
Genocchi’s [17], as edited by Peano. (The German translation is [18].) A clear
formulation of Dini’s theorem and a proof is also found in [34], with a helpful
diagram.

Theorem 1 weakens Dini’s assumptions by allowing both real and complex
scalars, and by weakening the smoothness requirements on $f$. Although
Theorem 1 asserts that all the implicit functions are differentiable at $(x, y)$, it cannot
claim uniqueness of the implicit functions.

^29 We have not had access to Briot and Bouquet’s first edition of 1859. (Osgood [33]
mistakenly dates the 2nd edition as 1873.)

^30 The paper itself is dated July, 1910.

1893. Jordan. In Sections 91-95 of [27] the scalars are real, and the results
do not seem to go beyond those of Dini. The general case $k \ge 1$, $n \ge 1$ is
considered, $f$ is $C^1$, and a unique implicit function $\phi$ is obtained. Jordan notes
that its partial derivatives exist, and his discussion makes clear that they are
continuous.

In Section 191, page 178, the scalars are complex. The case $k = 1$, $n \ge 1$ is
considered in detail; the possibility of an extension to $k \ge 1$ is mentioned. The
function $f$ is analytic (“synectic”), and a unique implicit function $\phi$ is obtained,
and it is shown to be analytic, with the usual formula for its partial derivatives.

A clear proof along the same lines, for $k = 1$, $n = 1$, is also to be found in
[34], p. 345ff.

Our Theorem 1 weakens the analytic hypothesis on $f$ to differentiability at
$(x, y)$, while retaining existence of an implicit function and differentiability of
every implicit function at $(x, y)$. It cannot, however, guarantee uniqueness or
analyticity of the implicit function.

1899. Lindelöf. In [29] the scalars are complex. The general case $k \ge 1$,
$n \ge 1$ is considered. The function $f$ is assumed analytic, an implicit function
$\phi$ is obtained, and it is shown to be analytic.

The proof is based on power series expansions of $f$ and the implicit
function $\phi$, using Cauchy’s Calcul des limites, in contrast to Jordan (1893), which
reduced the complex case to the real case.

Our Theorem 1 weakens Lindelöf’s assumption that $f$ is analytic to
differentiability at $(x, y)$ and local continuity, while retaining existence of an implicit
function. While it obtains differentiability of every implicit function at $(x, y)$,
it cannot claim uniqueness.

1901. Osgood. This work [33] contains a clear statement of an implicit function
theorem for complex scalars, with $k \ge 1$, $n \ge 1$, and with an explicit statement of
uniqueness of the implicit function. We have also found this to be a very useful
source for historical information; see especially pp. 19 (footnote 30) and 103
(footnote 247).

1903. Goursat. In [19] the scalars are real, and $n \ge 1$ and $k \ge 1$. Goursat
makes no differentiability assumption with respect to $x$. He assumes that $f$ is
continuous in a neighborhood of $(\bar{x}, \bar{y})$, and he retains the assumption that
$f(x, \cdot)$ is $C^1$ for $x$ near 0. He obtains a unique implicit function, and shows
that it is continuous.

Goursat’s proof used a fixed point theorem, what we would now call a
contraction mapping theorem, which he proved using Picard’s method of
successive approximations. Defining

$$g(x, y) = y - (f_y(\bar{x}, \bar{y}))^{-1} f(x, y) \tag{96}$$

he shows that $g(x, y) = y$ has a unique solution for $y = \phi(x)$ that is a
continuous function of $x$.^31
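The successive-approximation scheme built on (96) can be sketched as a fixed-point iteration. The Python example below is a minimal illustration, not Goursat's own argument; the function $f(x, y) = x^2 + y^2 - 1$, the base point $(\bar{x}, \bar{y}) = (0, 1)$, and the stopping rule are assumptions made for the sake of the example. Note that the derivative $f_y$ is frozen at the base point, unlike in Newton's method.

```python
def f(x: float, y: float) -> float:
    """Hypothetical example with f(0, 1) = 0."""
    return x * x + y * y - 1.0


F_Y_BAR = 2.0  # f_y evaluated at the base point (0, 1); kept fixed throughout


def g(x: float, y: float) -> float:
    """The map of (96): g(x, y) = y - f_y(xbar, ybar)^(-1) * f(x, y)."""
    return y - f(x, y) / F_Y_BAR


def implicit_by_iteration(x: float, y_start: float = 1.0,
                          tol: float = 1e-14, max_iter: int = 200) -> float:
    """Iterate y <- g(x, y) until successive iterates differ by at most tol."""
    y = y_start
    for _ in range(max_iter):
        y_next = g(x, y)
        if abs(y_next - y) <= tol:
            return y_next
        y = y_next
    raise RuntimeError("iteration did not converge; x may be too far from the base point")


if __name__ == "__main__":
    print(implicit_by_iteration(0.3))
```

In this example $\partial g / \partial y = 1 - y$, which is small near $y = 1$, so $g(x, \cdot)$ is a contraction there; that is the Lipschitz condition with constant less than 1 on which the uniqueness claim rests.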

Theorem 2 weakens Goursat’s assumptions by allowing both real and complex
scalars, and by weakening the smoothness requirements on $f$. Although
Theorem 2 obtains existence of an implicit function and asserts that all implicit
functions are continuous, it cannot claim uniqueness of the implicit functions.

The Goursat result (dropping differentiability of $f$ with respect to $x$, and
assuming continuity of $f$ near $(x, y)$) is also found in Bliss’s published lecture
[3], pp. 8-9 on the “Fundamental Existence Theorems.” Although Bliss cites
Goursat’s paper, his method of proof is based on Taylor’s Theorem.

In [22] Graves notes that existence, uniqueness, and continuity can be
obtained without assuming any differentiability of $f$ with respect to the variable
$x$. His remark (p. 139) is in the midst of his proof of his Theorem 2 (p. 138).

1909. W. H. Young. There are several theorems to consider in Young’s 
[39]. In all of them, the scalars are the reals. 

Young’s Theorem 9 modifies Dini’s result, replacing the differentiabil- 
ity assumption on / with an assumption that / is differentiable of order r for 
some r ^ 2. It asserts the existence of a unique implicit function 0, which 
is also differentiable of order r. Neither our Theorem 1 nor Dini’s theorem 
require second order differentiability for existence of an implicit function. 

Our Theorem 1 uses weaker assumptions than Young’s Theorem 9 in allow- 
ing both real and complex scalars, and it uses weaker smoothness requirements 
on / (differentiability at (x, y), instead of twice differentiable at (x, y), hence 
in a neighborhood). Although our Theorem 1 obtains existence of an im- 
plicit function and asserts that all implicit functions are differentiable at x, it 
cannot claim uniqueness of the implicit functions. 

Young’s Theorem 10 has a weaker assumption on f than Dini’s theorem,
since it requires a lower order differentiability of f with respect to y
(differentiability in a neighborhood of $(\bar{x}, \bar{y})$, rather than $C^1$). However it
strengthens Dini’s assumption that $f_y(\bar{x}, \bar{y})$ is nonsingular (hence nonsingular
in some neighborhood, since f is $C^1$ for Dini), by imposing nonsingularity
conditions on certain principal minors (in some neighborhood). Young then concludes
that there exists a unique implicit function $\phi$, and it is differentiable at $(\bar{x}, \bar{y})$.

Young remarks that only the uniqueness of $\phi$ is affected if one weakens
the assumptions of his Theorem 10 by only requiring that the Jacobian and
principal minor conditions hold at $(\bar{x}, \bar{y})$, rather than throughout some
neighborhood.^^

He notes that the solution is unique provided that g is Lipschitz continuous with
constant K < 1, observing that this automatically holds when $f_y$ is continuous in
a neighborhood of $(\bar{x}, \bar{y})$. (See also footnote 10, p. 78 above.)

Page 420, paragraph 16.

Our Theorem 2 uses weaker assumptions than in Young’s Theorem 10 in
allowing both real and complex scalars, in only requiring continuity of f
locally and differentiability at $(\bar{x}, \bar{y})$ (rather than differentiability locally), in not
requiring Young’s conditions on principal minors of the Jacobian in a
neighborhood (it only requires that the Jacobian itself be nonsingular at $(\bar{x}, \bar{y})$). Our
Theorem 2, however, cannot claim uniqueness of the implicit functions.

Young claims (in his Corollary 3 of Theorem 10) that Dini’s result is a 
corollary of his Theorem 10, although that is not clear to us. 

In neither Theorem 9 nor Theorem 10 does Young consider a situation in
which f is not differentiable with respect to x, in contrast to Goursat’s result
and our Theorem 2. However, his Theorem 5, as extended in his section 10,
contains a special case that does not require differentiability. For the special
case $k = 1$, $n \geq 1$, assuming only that $f_y(\bar{x}, \bar{y})$ is nonzero, and that f is
continuous separately in each argument, he obtains in parts (1) and (2) the
existence of an implicit function $\phi$. (While there need not be a unique implicit
function, he notes that there exists an upper semicontinuous one and a lower
semicontinuous one.)

The basic idea of Young’s proof is similar in spirit to what others have
done. Because it yields a better result for the special case $k = 1$, $n \geq 1$ than
either his general Theorems 9 or 10, or our Theorem 2, we present here a
proof sketch. (We weaken his continuity assumption with respect to x by only
requiring continuity at $\bar{x}$.)

Because $f(\bar{x}, \bar{y}) = 0$ and $f_y(\bar{x}, \bar{y}) \neq 0$, there exists a $y$ such
that $f(\bar{x}, y) > 0$ and $f(\bar{x}, -y) < 0$ (normalizing so that $\bar{y} = 0$ and
$f_y(\bar{x}, \bar{y}) > 0$). Because $f(\,\cdot\,, y)$ and $f(\,\cdot\,, -y)$ are continuous
at $\bar{x}$, for all small enough $x$ we have both $f(\bar{x} + x, y) > 0$
and $f(\bar{x} + x, -y) < 0$. Then continuity of $f(\bar{x} + x, \,\cdot\,)$ and the
Intermediate Value Theorem imply $f(\bar{x} + x, y^*) = 0$ for some $y^*$
in the interval $[-y, y]$. By continuity of $f(\bar{x} + x, \,\cdot\,)$, the set of such
$y^*$ is closed, so one way to define the value of an implicit function
at $x$ is to choose the maximum such $y^*$.
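
The construction in this sketch can be imitated numerically. The following is a minimal illustration under stated assumptions, not Young's argument itself: the function f(x, y) = y - |x| is a hypothetical example (continuous, not differentiable in x at 0, with f_y = 1), the names `max_root` and `implicit_value` are ours, and the grid scan only approximates the largest zero, since zeros lying strictly between grid points can be missed.

```python
def max_root(f_slice, y_hat, grid=2000, tol=1e-12):
    """Approximate the largest zero of f_slice on [-y_hat, y_hat].

    Assumes f_slice(y_hat) > 0 and f_slice(-y_hat) < 0, as in the sketch.
    Scans downward from y_hat for the first grid point where f_slice <= 0,
    then bisects inside that bracket.
    """
    step = 2.0 * y_hat / grid
    hi = y_hat
    if f_slice(hi) <= 0.0:
        raise ValueError("expected f > 0 at the top of the interval")
    for i in range(1, grid + 1):
        lo = y_hat - i * step
        if f_slice(lo) > 0.0:
            hi = lo  # still positive: the largest zero lies further down
            continue
        # Invariant during bisection: f_slice(hi) > 0 and f_slice(lo) <= 0.
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if f_slice(mid) > 0.0:
                hi = mid
            else:
                lo = mid
        return 0.5 * (lo + hi)
    raise ValueError("no sign change found on the grid")


def implicit_value(f, x, y_hat):
    """Value at x of the implicit function that picks the largest zero."""
    return max_root(lambda y: f(x, y), y_hat)


def f(x, y):
    # Hypothetical example: continuous, but not differentiable in x at x = 0.
    return y - abs(x)


phi_at_03 = implicit_value(f, 0.3, 1.0)
phi_at_m025 = implicit_value(f, -0.25, 1.0)
```

For this example the implicit function is |x|, so the two computed values should be close to 0.3 and 0.25.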

Other. Without aiming at completeness, we mention a few other treatments 
of the implicit function theorem. Careful statements of the classical (Dini) ver- 
sion are found in Bolza [4] and Caratheodory [8]. 

In the complex domain, Goursat [20], pp. 399 ff., [21], pp. 233 ff. and
Osgood [35], §§45, 105, [35], pp. 86-86, and Markushevich [31] all utilize
the Weierstrass Preparation Theorem [38], pp. 107-114.^^ This theorem deals
with cases where the equation $f(x, y) = 0$ has roots $y(x)$ whose multiplicity
m may equal or exceed 1, and is applicable to cases where $f_y(\bar{x}, \bar{y}) = 0$, and in
this respect has broader applicability than the results established in the present
paper. When m = 1, the Preparation Theorem can be used to prove the Implicit
Function Theorem for the case where $f_y(\bar{x}, \bar{y}) \neq 0$ and f is analytic. This is
shown most explicitly by Markushevich [31], p. 109, Theorem 3.11.

^^ Published in 1886, but presented in lectures since 1860.

Evgrafov’s proof [16] uses series expansions and Cauchy’s integral
formulae, as well as the Fréchet property of the differential. In addition to
proving existence, uniqueness, and continuity, it also derives the formula
$\phi'(\bar{x}) = -f_y(\bar{x}, \bar{y})^{-1} f_x(\bar{x}, \bar{y})$.
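
The standard derivative formula for an implicit function, $\phi'(\bar{x}) = -f_y(\bar{x}, \bar{y})^{-1} f_x(\bar{x}, \bar{y})$, can be checked on a simple scalar case. The sketch below is only an illustration: the circle equation and the point (0.6, 0.8) are our own choice of example, not taken from any of the works cited.

```python
import math


def f_x(x, y):
    # Partial derivative in x of the example f(x, y) = x**2 + y**2 - 1.
    return 2.0 * x


def f_y(x, y):
    # Partial derivative in y of the same example.
    return 2.0 * y


def phi(x):
    # Explicit solution of f(x, phi(x)) = 0 on the upper half circle.
    return math.sqrt(1.0 - x * x)


x_bar, y_bar = 0.6, 0.8

# Derivative predicted by the implicit function formula.
formula = -f_x(x_bar, y_bar) / f_y(x_bar, y_bar)

# Central finite difference of the explicit solution, for comparison.
h = 1e-6
numeric = (phi(x_bar + h) - phi(x_bar - h)) / (2.0 * h)
```

Both quantities approximate the slope of the circle at (0.6, 0.8), which is -0.75.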

The paper by Hildebrand and Graves [24] was perhaps the first to use 
Banach space techniques to establish the Implicit Function Theorem through 
methods applying to both the real and complex domains. 

Nijenhuis [32], building on a notion of Leach [28], obtained inverse and
implicit functions in the real case, under an assumption (“strong
differentiability”) weaker than $C^1$ but stronger than Fréchet differentiability.

1974. Halkin. Halkin eliminated the $C^1$ hypotheses of earlier authors,
using differentiability, continuity, and Brouwer’s fixed point theorem [23],
Theorem D. Unaware of his earlier result, we embarked on a similar quest, using
very similar techniques. Though similar, some of our results differ from those
of his theorems in a few significant respects.

We do not specify that the function f in Theorem 1 is continuous in a
full neighborhood of $(\bar{x}, \bar{y})$; we only require continuity with respect to the y
variable. In fact, we weaken even that continuity considerably, allowing
continuity requirements to depend on the distance of the x variable from $\bar{x}$. And
in Remark 1 we show that those continuity requirements are “tight.” Halkin’s
Theorem D, however, assumes full continuity jointly in both variables, in a full
neighborhood.

Halkin’s Theorem D allows the space X to be a normed linear space, rather 
than the finite dimensional linear space we assume. (It appears that such a 
generalization of our result would be possible with only slight modifications of 
our proof.) 

Our Theorem 2 is essentially the same as Halkin’s Theorems A, B, and C, 
which do not require differentiability in the x variable. 

Halkin does not have a parallel of our Theorem 3, which generalizes our 
Theorems 1 and 2. 

Finally, our inverse function Theorem 4 is essentially the same as Halkin’s 
Theorem G, and our proof obtains it from the implicit function theorem by the 
same standard techniques that he uses. 

1989. Radulescu and Radulescu. Our inverse function Theorem 4 is es- 
sentially the same as Radulescu and Radulescu’s Theorem 3.4 [36]. Their proof 
uses topological degree theory instead of the Brouwer fixed point theorem we 
use. While they do not state an implicit function theorem, that could be derived 
from their inverse function theorem by standard methods, but it would impose 
stronger continuity requirements than in our Theorem 1.

Abstract. Constrained optimization problems are central to economics, and Lagrange
multipliers — when they exist — play a basic role in solving them, in theory and in 
practice. Examples are well known of optimization problems for which multipliers do 
not exist. So it is important to know what requirements constraint functions must satisfy 
to be “Lagrange regular,” i.e. to guarantee existence of multipliers for broad classes of 
maximand or minimand functions. We relax the requirements in three directions: 

1. We reduce the smoothness requirements on constraints. This allows weaker and 
more uniform hypotheses for mixed inequality and equality constraints, permit- 
ting, for example, just differentiability at the optimum and continuity in a neigh- 
borhood. (We allow much weaker hypotheses, as well.) 

Beyond smoothness, other requirements have long been imposed on con- 
straint functions, to avoid simple examples lacking multipliers. We examine two 
types of such “constraint qualifications”. 

2. We provide new, relaxed constraint qualifications of both Jacobian and path 
types. 

a) Our Jacobian constraint qualifications (23), (24), (25) permit spanning
properties as alternatives to the usual rank restrictions.

b) Our path constraint qualifications (69), (72), (73) impose fewer restrictions
than before on the directions permitted for constraint derivatives. (The
logical relationships are indicated in (149).)

* For valuable comments on an earlier version, we are indebted to Professor
Kam-Chau Wong, Chinese University of Hong Kong; Professor Hülya Eraslan,
University of Pennsylvania; and Nevzat Eren, University of Minnesota.

3. Our relaxed requirements are not only sufficient for avoiding many well-known 
counterexamples — they cannot be weakened further: 

c) We formalize a notion of minimality for Jacobian constraint qualifications, 
and prove that ours are minimal for “Lagrange regularity.” 

d) We prove that our path constraint qualifications are necessary for “La- 
grange regularity.” 

The tool enabling us to relax smoothness requirements on equality constraints is our
Non-$C^1$ Implicit Function Theorem, p. 142. (See also the reference in Section 8 to
Halkin’s work.)

Since we are weaving together many strands, the following outline may be helpful. 



1. Introduction: Constraint qualifications, definitions    page 98
2. Constrained maximization    page 100
   Lagrange regularity    page 101
   Fundamental Lemma    page 104
3. The Jacobian criterion    page 106
4. The Jacobian criterion is sufficient    page 110
   Theorem 1    page 110
   Theorem 1+    page 114
5. The Jacobian criterion is minimal    page 120
   Theorem 2    page 120
6. The Tangency-path criterion    page 124
   Theorem 3 (Sufficiency)    page 127
   Theorem 4 (Necessity)    page 132
   Weaker than Jacobian criterion    page 132
7. Mathematical Appendix    page 138
8. Historical comments and comparisons    page 143



Key words: Constrained optimization, Lagrange, Kuhn-Tucker, non-$C^1$ analysis,
minimal constraint qualification, Jacobian Criterion, Tangency-Path Criterion



1. Introduction 

Lagrange-Kuhn-Tucker multipliers are important theoretical and practical tools 
for solving many types of constrained optimization problems. It is well known, 
however, that such multipliers do not exist for all problems — that certain 
requirements have to be imposed on the constraints, in order to be assured that 
every maximand or minimand function will generate multipliers. In this paper 
we seek to reduce those requirements as much as possible in two directions: 
smoothness requirements, and “constraint qualifications.” 

(1) Smoothness of constraints. While the classical Lagrange Multiplier
Theorem assumes smoothness of equality constraints and a (rank) Jacobian-type
condition, Kuhn and Tucker reduced that to differentiability for inequality
constraints, using a path-type condition. Is a similar reduction possible
for equality constraints or mixed constraints? As shown below, the answer
is yes.^,^ For Jacobian-type constraint qualifications we show that it can be
reduced to differentiability at just the optimizing point, together with partial
continuity. For path-type constraint qualifications we reduce the smoothness to
just existence of partial derivatives.

(2) Constraint qualifications. For optimality problems with inequality 
constraints, it has been known since Karush that some “constraint qualifica- 
tions” were required to guarantee existence of “Lagrange multipliers.” In the 
Karush-Kuhn-Tucker inequality theorems, these were properties of paths ly- 
ing in the constraint set. For problems with equality constraints, correspond- 
ing qualifications were (rank) properties of the constraint Jacobian, implicit 
already in Lagrange. For both Jacobian and path conditions, our results are 
achieved under weaker assumptions than found elsewhere. We also introduce 
a notion of minimality: our Jacobian condition is minimally sufficient for exis- 
tence of Lagrange multipliers; finally, we prove our path condition is necessary 
and sufficient. 

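Type 1 results: Theorems 1, 1+, 3. (Weak smoothness hypotheses and
weak constraint qualifications sufficient for Lagrange regularity.) Suppose that
$\bar{u} \in \mathbb{R}^n$ maximizes the real valued function f subject to $g \geq 0$ and $h = 0$.
Then under either Condition (a) or Condition (b) below, there exist Lagrange
multipliers $\lambda \geq 0$ in $\mathbb{R}^m$ and $\mu$ in $\mathbb{R}^k$ such that:

    $f'(\bar{u})^T + \lambda^T g'(\bar{u}) + \mu^T h'(\bar{u}) = 0$,    (1a)

i.e., such that:

    $\frac{\partial f}{\partial u_i}(\bar{u}) + \lambda_1 \frac{\partial g^1}{\partial u_i}(\bar{u}) + \cdots + \lambda_m \frac{\partial g^m}{\partial u_i}(\bar{u}) + \mu_1 \frac{\partial h^1}{\partial u_i}(\bar{u}) + \cdots + \mu_k \frac{\partial h^k}{\partial u_i}(\bar{u}) = 0 \quad (i = 1, \ldots, n)$.    (1b)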



Condition a) f, g and h are differentiable at $\bar{u}$, and h in addition satisfies
certain partial continuity conditions there, and the Jacobian $(g'(\bar{u}), h'(\bar{u}))$
has certain properties.

Condition b) f, g, and h admit partial derivatives, and satisfy a certain
“path-type” constraint qualification.

^ Halkin [16] has an earlier result in this direction. Although there are many points of
similarity between his approach and parts of ours, his results concern only
“quasi-regularity” (p. 103 below), rather than the full regularity of our Theorems 1, 1+;
that much weaker and less informative conclusion enables him to avoid all mention
of constraint qualifications, which are the main focus of all our theorems. Also, his
smoothness assumptions are more restrictive than those of our Theorem 1+(A,B).
See also footnote 29 and the Historical Notes and Comparisons in Section 8.

^ We restrict attention to constraints and optimizing functions that have certain types
of derivatives at the optimizing point. We do not consider subdifferentials, which
would become more natural if we restricted attention to constraints that were
concave or convex functions.

Note that the “constraint qualifications” (a) and (b) refer only to the
constraint functions g and h, not to the maximand f. When multipliers exist for
g and h for a whole class $\mathcal{F}$ of functions f, we say that (g, h) is “Lagrange
regular” for $\mathcal{F}$.

Type 2 results: Theorems 2, 4. (Minimality of constraint qualifications.)
To guarantee Lagrange regularity of (g, h) for a class $\mathcal{F}$ of maximands:

i) The properties listed in Condition (a) are necessary for Lagrange reg- 
ularity, if we restrict ourselves to conditions expressed in terms of the 
Jacobians of the constraint functions, i.e. in terms of partial derivatives. 

ii) The properties listed in Condition (b) are necessary for Lagrange regu- 
larity. 

Our relaxations of constraint qualifications listed in Conditions (a) and (b) 
are precisely those needed to make the constraint qualifications minimally suf- 
ficient in (a), and necessary and sufficient in (b), respectively. In other words, if 
Condition (a) were violated, then the existence of Lagrange multipliers could 
not be decided by looking at only the Jacobians of g and h. And if Condi- 
tion (b) were violated, then the existence of Lagrange multipliers could not be 
decided by looking at only the constraints g and h: one would have to look at
the particular maximand f as well.



2. Constrained maximization

For simplicity, we study just constrained maximization, since the minimization
of a function f can be studied in terms of maximizing $-f$. And we discuss
nonnegativity constraints $g \geq 0$, since nonpositivity constraints can be studied
through $-g \geq 0$.

We seek to maximize a real valued function f on a set $U \subseteq \mathbb{R}^n$, subject
to conditions $g^1 \geq 0, \ldots, g^m \geq 0$ and $h^1 = 0, \ldots, h^k = 0$, where the $g^i$ and
$h^j$ are also functions from U to $\mathbb{R}$. Either g or h, or both, may be absent. To
avoid trivialities we assume throughout that n > 0.

Let $g = (g^1, \ldots, g^m)$ and $h = (h^1, \ldots, h^k)$, and define the constraint set
by

    $C(g, h) = \{x \in U : \forall i \; g^i(x) \geq 0 \ \&\ \forall j \; h^j(x) = 0\}$.    (2)

A point $\bar{u} \in U$ is said to maximize f on $C(g, h)$ if:

    $\bar{u} \in C(g, h)$
    $\forall u \in C(g, h) \quad f(\bar{u}) \geq f(u)$.    (3)

When g is absent, this is a problem of classical mathematics; when h is
absent, it is known in economics as a “Kuhn-Tucker” problem. We write $C(g)$
when h is absent, and $C(h)$ when g is absent; the context will avoid ambiguity.
When both g and h are absent (so there are no constraints), then $C(g, h) = U$.

2.1 Lagrange regularity 

Suppose that $\bar{u} \in U$ maximizes f on $C(g, h)$ and that f, g, and h have partial
derivatives at $\bar{u}$. We are interested in the existence of “Lagrange multipliers,”
$\lambda \in \mathbb{R}^m$ and $\mu \in \mathbb{R}^k$, satisfying:

    $f'(\bar{u})^T + \lambda^T g'(\bar{u}) + \mu^T h'(\bar{u}) = 0$.    (4)

In addition, we will want $\lambda$ to satisfy a nonnegativity condition. But simple
examples show that there may exist no $\lambda$ and $\mu$ satisfying (4). Consider:^

    $g^1(u_1, u_2) = u_2 - u_1^2 \geq 0$
    $g^2(u_1, u_2) = -u_2 \geq 0$    (5)

or:

    $h^1(u_1, u_2) = u_2 - u_1^2 = 0$
    $h^2(u_1, u_2) = -u_2 - u_1^2 = 0$.    (6)

In each case the constraint set is {0}, so any function $f(u_1, u_2)$ has a
constrained maximum at $\bar{u} = 0$. But $f(u_1, u_2) = u_1$ admits no $\lambda$ or $\mu$
satisfying (4).
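
The failure in these two examples can be checked mechanically. Condition (4) says that the gradient of f at the maximizer is a linear combination of the constraint gradients, so it fails exactly when adding the gradient of f as an extra row raises the rank of the constraint Jacobian. The sketch below is a minimal illustration, assuming the reading of (5) and (6) in which both constraint gradients at 0 are (0, 1) and (0, -1); the helper names `rank` and `in_row_span` are ours.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a small matrix, by exact Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    r = 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def in_row_span(vector, rows):
    """True if vector is a linear combination of the given rows."""
    return rank(rows) == rank(rows + [vector])


# Gradients at the origin of the two constraints in (5), and likewise in (6).
constraint_jacobian = [[0, 1], [0, -1]]

# Gradient of the maximand f(u1, u2) = u1.
grad_f = [1, 0]

multipliers_exist = in_row_span(grad_f, constraint_jacobian)
```

Here `multipliers_exist` is `False`: the gradient (1, 0) is not in the span of (0, 1) and (0, -1), so (4) has no solution even before any sign restriction on the multipliers is imposed.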

So special assumptions are needed to guarantee existence of multipliers
$\lambda$ and $\mu$. To state them, we first explain the notion of a “binding” inequality
constraint.

Binding constraints. An inequality constraint $g^i$ is binding at $\bar{u}$ if $g^j(\bar{u}) \geq 0$
for all $j = 1, \ldots, m$, and if $\bar{u}$ belongs to the boundary of the “individual”
constraint set $\{x \in \mathbb{R}^n : g^i(x) \geq 0\}$.^

With $U \subseteq \mathbb{R}^n$, with $\bar{u} \in U$ and $g: U \to \mathbb{R}^m$, we partition $M = \{1, \ldots, m\}$
to distinguish the $g^i$ that are binding at $\bar{u}$ from the others:^

^ This is analogous to Slater’s example in [38]. 

^ The binding property refers to the boundary of the domain of g\ not of its range. 
Thus need not be binding at u even if g"' {u) = 0 (e.g. if =0 in an open 
neighborhood of g\u)), and conversely, g^ can be binding at u even if g\u) > 0 
(e.g. \fU = {{ui,U 2 ) : ui 2}, and ^"(^ 1 ,^ 2 ) = ui ^ U 2 , and {ui,U 2 ) = 

( 1 , 1 )). 

Note that if g^ = g^, then each constraint may be binding at u. 

^ One could introduce an analogous distinction between binding and nonbinding
equality constraints, but we shall not do so.

    $I = \{i \in M : \bar{u}$ is in the boundary of $\{u \in U : g(\bar{u}) \geq 0 \ \&\ g^i(u) \geq 0\}\}$
    $J = M \setminus I$    (7)

(either I, J, or both may be empty). When I or J is nonempty, we write,
respectively:

    $g_I : U \to \mathbb{R}^p$
    $g = (g_I, g_J) : U \to \mathbb{R}^p \times \mathbb{R}^{m-p}$    (8)

Without loss of generality, we assume that the components of $g_I$ are the first p
elements of g: $g_I = (g^1, \ldots, g^p)$.

Regularity. Now we seek conditions under which the fact that $\bar{u} \in U \subseteq \mathbb{R}^n$
maximizes f on $C(g, h)$ implies the existence of a $\lambda \in \mathbb{R}^m$ and a $\mu \in \mathbb{R}^k$
such that:^,^

    $f'(\bar{u})^T + \lambda^T g'(\bar{u}) + \mu^T h'(\bar{u}) = 0$,    (9)

    $\lambda \geq 0$    (10a)

and

    $\lambda_i = 0$ if $g^i$ is not binding at $\bar{u}$ (i.e., $i \in J$).    (10b)

Rather than solving (9) and (10) for $\lambda$ and $\mu$, given particular g, h, and f,
one may ask whether the given functions $g^i$ and $h^j$ allow us to solve, for each
f in some family $\mathcal{F}$ that is maximized on $C(g, h)$ at $\bar{u}$, for such $\lambda$ and $\mu$. (We
shall always assume that each $f \in \mathcal{F}$ has partial derivatives at $\bar{u}$.) So we say
that (g, h) is Lagrange mixed-regular for $(\bar{u}, \mathcal{F})$ if:^

i) all the $g^i$ and all the $h^j$ have partial derivatives at $\bar{u}$;

ii) for every function $f \in \mathcal{F}$, if $\bar{u}$ maximizes f on $C(g, h)$ then (9) and (10)
hold for some $(\lambda, \mu)$.

^ It is tempting to suppose that (10) implies $\lambda_i > 0$ whenever $g^i$ is binding at $\bar{u}$.
Simple examples show that is not true. Consider this example in $\mathbb{R}^1$: $f(x) = -x^2$,
$g(x) = x$, and $\bar{u} = 0$. The constraint g is binding at the maximizer $\bar{u} = 0$, but
$\lambda = 0$ is required by (9).

^ When U is open (as we assume in Parts 4 and 5), then if $g^i$ is binding at $\bar{u}$ it follows
that $g^i(\bar{u}) = 0$. So for open U, condition (10) implies (though is not implied by)
this “complementary slackness” condition:

    $\lambda_i \geq 0 \ \&\ \lambda_i g^i(\bar{u}) = 0$.

^ The terminology is modified from [4]. We really should call this Lagrange
regularity for $(U, \bar{u}, \mathcal{F})$; for brevity, we omit the reference to U. (Alternatively, we can
note that U is determined by $\mathcal{F}$.)

When h is absent, we say that g is Lagrange inequality-regular for $(\bar{u}, \mathcal{F})$ if:

i) all the $g^i$ have partial derivatives at $\bar{u}$;

ii) for every function $f \in \mathcal{F}$, if $\bar{u}$ maximizes f on $C(g)$ then:

    $f'(\bar{u})^T + \lambda^T g'(\bar{u}) = 0$    (11)

and (10) hold for some $\lambda$.

And when g is absent we say that h is Lagrange equality-regular for $(\bar{u}, \mathcal{F})$ if:

i) all the $h^j$ have partial derivatives at $\bar{u}$;

ii) for every function $f \in \mathcal{F}$, if $\bar{u}$ maximizes f on $C(h)$ then:

    $f'(\bar{u})^T + \mu^T h'(\bar{u}) = 0$    (12)

for some $\mu$.

When for each $f \in \mathcal{F}$ there are unique $\lambda$ and $\mu$ satisfying (9) and (10), we say
that (g, h) is Lagrange mixed-regular for $(\bar{u}, \mathcal{F})$, uniquely; and we use similar
uniqueness definitions for the inequality and the equality cases.

We often refer simply to Lagrange regularity; the context will make the type
clear. When both g and h are absent, we say that we have Lagrange regularity
for $(\bar{u}, \mathcal{F})$ if: for all $f \in \mathcal{F}$, if $\bar{u}$ maximizes f on U then $f'(\bar{u}) = 0$.

It is important to note that Lagrange regularity is a property of particular
constraint functions g and h, not just a property of the constraint set $C(g, h)$.
We demonstrate this in Remark 6.1.iii, page 126 below, and we note some
implications in the Removable Constraints section, page 109 below.

Quasi-regularity. A weaker notion of regularity apparently dates back to
Hilbert (for equalities). We say that (g, h) is Lagrange quasi mixed-regular for
$(\bar{u}, \mathcal{F})$ if the properties of Lagrange regularity hold except that conditions
(9) and (10) are replaced by the weaker condition: there exists a nonzero vector
$(\lambda_0, \lambda, \mu)$ such that

    $\lambda_0 f'(\bar{u})^T + \lambda^T g'(\bar{u}) + \mu^T h'(\bar{u}) = 0$,    (13)

    $\lambda_0 \geq 0$    (14)

    $\lambda \geq 0$.    (15)

Similar definitions apply for inequalities and for equalities.^

^ We assumed that partial derivatives of the $g^i$, $h^j$, and f exist at $\bar{u}$, even when
not binding, in order to give meaning to (9). We could avoid assuming that the
nonbinding $g^i$ possess partial derivatives at $\bar{u}$, by using (16) and (17) below in
place of (9) and (10). This would be more general, but it would be less convenient
to apply where the structure of the constraint set is not known a priori.

In this paper we focus on regularity, as it is more informative than quasi- 
regularity. However, Lagrange quasi mixed-regularity implies Lagrange mixed- 
regularity, under the special rank assumption (31) below. This follows from the 
Fundamental Lemma below, on purely algebraic grounds. 

Now we seek $\lambda$ and $\mu$ that satisfy (9) and (10) for all such f. We begin by
restating the problem in a simpler form with a natural

Reduction. Let $g: U \to \mathbb{R}^m$, $h: U \to \mathbb{R}^k$, and $f: U \to \mathbb{R}$ have partial
derivatives at $\bar{u}$.^ Suppose that the binding constraints are $g_I = (g^1, \ldots, g^p)$.
Then it is clear that a solution $\lambda \in \mathbb{R}^m$ and $\mu \in \mathbb{R}^k$ to (9) and (10) exists if
and only if there exists a $\lambda \in \mathbb{R}^p$ and a $\mu \in \mathbb{R}^k$ such that:

    $f'(\bar{u})^T + \lambda^T g_I'(\bar{u}) + \mu^T h'(\bar{u}) = 0$,    (16)

and such that:

    $\lambda \geq 0$.    (17)

2.2 The fundamental lemma. Constraint qualifications 

Although our basic maximization question is one of analysis, it is useful to 
view it algebraically. Our answers all rest on the following simple consequence 
of the Theorem of the Alternative. 

Lemma 2.1 (Fundamental lemma). Let $U \subseteq \mathbb{R}^n$, and let $\bar{u} \in U$ be a limit
point.

A) Suppose $f: U \to \mathbb{R}$, $g: U \to \mathbb{R}^p$ and $h: U \to \mathbb{R}^k$ are functions on
U that have partial derivatives at $\bar{u}$, with $g(\bar{u}) = 0$ and $h(\bar{u}) = 0$, and
suppose there does not exist $(\lambda, \mu) \in \mathbb{R}^p \times \mathbb{R}^k$ satisfying the following
two conditions:

    i)  $f'(\bar{u})^T + \lambda^T g'(\bar{u}) + \mu^T h'(\bar{u}) = 0$,    (18)

    ii) $\lambda \geq 0$.    (19)

Then there exists a $z \in \mathbb{R}^n$ such that:

    $g'(\bar{u}) z \geq 0$    (20a)
    $h'(\bar{u}) z = 0$    (20b)
    $f'(\bar{u}) z > 0$.    (20c)

B) When g is absent, part (A) holds if we delete references to, and terms and
relations involving $\lambda$ or g.

C) When h is absent, part (A) holds if we delete references to, and terms
and relations involving $\mu$ or h.

^ Halkin [16] used an approach similar to those of [9], [5], [6], and [24], using
Jacobian conditions to establish Lagrange quasi mixed-regularity under weaker
smoothness hypotheses than those predecessors.

^ Actually, the $g^{p+1}, \ldots, g^m$ need not have partial derivatives, since they don’t
appear in (16), and they effectively don’t appear in (9) since the corresponding $\lambda_i$
vanish according to (10b).

Proof This is an almost immediate application of the Theorem of the Alterna- 
tive A ^ □ 

The lemma is an algebraic statement about derivatives. It does not mention 
maximization, but now we show why it is basic to maximization theory. 

Suppose that f, g, and h are differentiable at $\bar{u}$, which maximizes f on
the constraint set $C(g, h)$. If (g, h) is not Lagrange regular, then the lemma
implies that (20) holds. If there were a path $u(\,\cdot\,)$ with values $u(t)$ lying in
$C(g, h)$, starting at $\bar{u}$ and with derivative $u'(0) = z$, then^^ we could rewrite
(20) as:

    $\frac{d}{dt} g(u(t))\big|_{t=0} = g'(\bar{u}) z \geq 0$    (by (20a))    (21a)
    $\frac{d}{dt} h(u(t))\big|_{t=0} = h'(\bar{u}) z = 0$    (by (20b))    (21b)
    $\frac{d}{dt} f(u(t))\big|_{t=0} = f'(\bar{u}) z > 0$    (by (20c)).    (21c)

So small movements along the path would keep us in the constraint set but
increase the value of f. And that would contradict the assumption that $\bar{u}$
maximizes f on the constraint set $C(g, h)$.
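
The role of the missing path can be seen concretely in example (5). The sketch below is an illustration under our reading of that example (g^1 = u_2 - u_1^2, g^2 = -u_2, maximand f = u_1, maximizer at the origin); the direction z = (1, 0) and the helper names are our own choices. It checks that z satisfies the inequalities of (20) with h absent, and that points along z leave the constraint set at once.

```python
def mat_vec(rows, z):
    """Product of a small matrix (list of rows) with a vector."""
    return [sum(a * b for a, b in zip(row, z)) for row in rows]


# Gradients at the origin of g^1 = u2 - u1**2 and g^2 = -u2 (our reading of (5)).
g_prime = [[0.0, 1.0], [0.0, -1.0]]

# Gradient of the maximand f(u1, u2) = u1.
f_prime = [1.0, 0.0]

# Candidate direction of the kind produced by the Fundamental Lemma.
z = [1.0, 0.0]

g_dir = mat_vec(g_prime, z)                     # left side of (20a)
f_dir = sum(a * b for a, b in zip(f_prime, z))  # left side of (20c)
satisfies_20 = all(v >= 0.0 for v in g_dir) and f_dir > 0.0


def feasible(u):
    """Membership in the constraint set C(g) of example (5)."""
    return u[1] - u[0] ** 2 >= 0.0 and -u[1] >= 0.0


# Points on the straight line through the origin in direction z.
line_points = [(t * z[0], t * z[1]) for t in (0.1, 0.01, 0.001)]
leaves_constraint_set = [not feasible(p) for p in line_points]
```

The direction satisfies (20), yet no path in the constraint set can have it as derivative, because that set is the single point 0. This is why the maximizer need not admit multipliers here.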

Thus algebra highlights the role of analysis. As (5) and (6) show, Lagrange
regularity requires some restrictions on the constraint functions g and h. As
the algebra suggests, a sufficient restriction is the existence of paths $u(\,\cdot\,)$ with
$u'(0) = z$ for any z satisfying (20). For historical reasons, such hypotheses are
typically called constraint qualifications.

Two general types of constraint qualifications have been developed. The
first simply asserts that any z satisfying (20a,b) is indeed the derivative of some
path lying in $C(g, h)$. We will call these path conditions.

See the Appendix. 

If g, h, and f were differentiable at $\bar{u}$.

Cf. [30]. 

Slater [38] proposed a constraint qualification that does not fall into the two types
we discuss. It was intended, however, for application to the Saddle Point
Equivalence Theorem, assuming concavity of the constraint functions $g^i$, rather than for a

The second type is computationally oriented, being expressed as algebraic
properties of the Jacobian matrix $(g'(\bar{u}), h'(\bar{u}))$. They may involve its
nonsingularity, the span of its rows, or other algebraic aspects. We call these Jacobian
conditions.

In simple examples (e.g. (5) and (6)) checking whether a path condition 
holds can be rather easy. In more complicated examples, it may be difficult. By 
contrast, the Jacobian conditions that we discuss are decidable in an algorith- 
mic fashion, as we will see later. 



3. The Jacobian criterion 



Our Jacobian condition will guarantee that every z satisfying (20a,b) (or the
corresponding variant when g or h is absent) is the derivative of some path
lying in $C(g, h)$ (or $C(g)$ or $C(h)$). In Theorem 1 we prove that the condition
is sufficient for Lagrange regularity. Then in Theorem 2 we prove it is as weak 
as possible in the class of sufficient Jacobian conditions. (It is generally not 
as weak, however, as the non- Jacobian path conditions in Theorems 3 and 4, 
which are both necessary and sufficient for Lagrange regularity.) In order to 
make these notions precise, we use the following terminology. 



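Sufficient and minimal Jacobian constraint qualifications. Let $U \subseteq \mathbb{R}^n$,
and let $\bar{u} \in U$ be a limit point. Let $\mathcal{G}$ be a set of functions $g: U \to \mathbb{R}^m$,
let $\mathcal{H}$ be a set of functions $h: U \to \mathbb{R}^k$, and let $\mathcal{F}$ be a set of functions
$f: U \to \mathbb{R}$. Assume that all functions in $\mathcal{G}$, $\mathcal{H}$, and $\mathcal{F}$ have partial
derivatives at $\bar{u}$.

By a Jacobian mixed-constraint qualification for $(\bar{u}, \mathcal{G}, \mathcal{H})$ we mean a
property $\mathcal{Q}$ of some members of the set

    $\left\{ \begin{pmatrix} g_I'(\bar{u}) \\ h'(\bar{u}) \end{pmatrix} : g \in \mathcal{G} \ \&\ h \in \mathcal{H} \right\}$

of $(p + k) \times n$ (real) matrices. To each property $\mathcal{Q}$ corresponds the set $Q$ of
matrices with that property.

We say that a Jacobian mixed-constraint qualification $\mathcal{Q}$ for $(\bar{u}, \mathcal{G}, \mathcal{H})$ is
$(\bar{u}, \mathcal{G}, \mathcal{H}, \mathcal{F})$-sufficient for Lagrange mixed-regularity if: for all $g \in \mathcal{G}$ and
$h \in \mathcal{H}$,

    $\begin{pmatrix} g_I'(\bar{u}) \\ h'(\bar{u}) \end{pmatrix} \in Q \implies (g, h)$ is Lagrange mixed-regular for $(\bar{u}, \mathcal{F})$.    (22)

We say that a Jacobian constraint qualification $\mathcal{Q}$ for $(\bar{u}, \mathcal{G}, \mathcal{H})$ is minimally
$(\bar{u}, \mathcal{G}, \mathcal{H}, \mathcal{F})$-sufficient for Lagrange mixed-regularity if:

i) $\mathcal{Q}$ is $(\bar{u}, \mathcal{G}, \mathcal{H}, \mathcal{F})$-sufficient for Lagrange mixed-regularity in the sense
of (22);

ii) no weaker Jacobian property (i.e., no proper superset of $Q$) is
$(\bar{u}, \mathcal{G}, \mathcal{H}, \mathcal{F})$-sufficient for Lagrange mixed-regularity: if $g \in \mathcal{G}$, $h \in \mathcal{H}$, and

    $\begin{pmatrix} g_I'(\bar{u}) \\ h'(\bar{u}) \end{pmatrix} \notin Q$,

then there is some $\hat{g} \in \mathcal{G}$ and some $\hat{h} \in \mathcal{H}$ with $\hat{g}_I'(\bar{u}) = g_I'(\bar{u})$ and
$\hat{h}'(\bar{u}) = h'(\bar{u})$ for which $(\hat{g}, \hat{h})$ is not Lagrange mixed-regular for $(\bar{u}, \mathcal{F})$.

We may abbreviate “minimally sufficient” to “minimal.”

proof of Lagrange regularity. Because of the convexity of the constraint set defined
by such $g^i$, Slater’s condition implies a path-type constraint qualification of the type
we discuss below. (See the historical comments in Section 8 below.)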

Analogous definitions of sufficiency and minimality apply with functions g 
for the inequalities problem, and with functions h for the equalities problem. 

Whether a Jacobian condition is minimally sufficient for Lagrange regularity
clearly depends on the set U, the element $\bar{u}$, and the classes $\mathcal{G}$ and $\mathcal{H}$ of
inequality and equality constraint functions under consideration, as well as on
the class $\mathcal{F}$ of maximand functions. To simplify discussing these classes, some
notation is useful.

In the following list, U is a subset of $\mathbb{R}^n$. For any $\bar{u} \in U$ we make these
definitions:

$\mathcal{F}_D(\bar{u})$ is the set of functions $f: U \to \mathbb{R}$ that are differentiable at $\bar{u}$.

$\mathcal{G}_D(\bar{u})$ is the set of functions $g: U \to \mathbb{R}^m$ that are differentiable at $\bar{u}$.

$\mathcal{H}_P(\bar{u})$ is the set of functions $h: U \to \mathbb{R}^k$ that have partial derivatives
at $\bar{u}$.

$\mathcal{H}_D(\bar{u})$ is the set of functions $h: U \to \mathbb{R}^k$ that are differentiable at $\bar{u}$.

$\mathcal{H}_{DC}(\bar{u})$ is the set of functions $h: U \to \mathbb{R}^k$ that are differentiable
at $\bar{u}$ and continuous in some neighborhood of $\bar{u}$.

$\mathcal{H}_{C^1}(\bar{u})$ is the set of functions $h: U \to \mathbb{R}^k$ that are $C^1$ at $\bar{u}$.

For example, to say that a Jacobian equality-constraint qualification $\mathcal{Q}$ is
$(\bar{u}, \mathcal{H}_{DC}(\bar{u}), \mathcal{F}_D(\bar{u}))$-sufficient means: for all $h \in \mathcal{H}_{DC}(\bar{u})$, if $h'(\bar{u}) \in Q$
then h is Lagrange equality-regular for $(\bar{u}, \mathcal{F}_D(\bar{u}))$.

The rest of this section states a new Jacobian constraint qualification, the
Jacobian Criterion. Section 4 will prove it is sufficient for Lagrange regularity
with respect to some important classes $\mathcal{G}$, $\mathcal{H}$, and $\mathcal{F}$. And Section 5 will prove
it is minimally sufficient with respect to those same classes (and to other classes
as well).

The Jacobian criterion. Suppose that $\bar{u} \in U \subseteq \mathbb{R}^n$ and that^15 $g_I: U \to \mathbb{R}^p$
and $h: U \to \mathbb{R}^k$ have partial derivatives at $\bar{u}$. Let $a(1), \ldots, a(p)$ be the
rows of the $p \times n$ Jacobian matrix $g_I'(\bar{u})$, and let $b(1), \ldots, b(k)$ be the rows of
the $k \times n$ Jacobian matrix $h'(\bar{u})$. We allow p = 0 or k = 0, or both; i.e. $g_I$ or
h may be absent.

^15 I is defined in (7).

A) (Mixed: Equalities and inequalities) We say (g, h) satisfies the (Mixed-Problem)
Jacobian Criterion at $\bar{u}$ if one of the following two mutually
exclusive conditions (a), (b) holds:

a) $\mathrm{rank}(h'(\bar{u})) = k < n$,    (23a)

and there exists a $z \in \mathbb{R}^n$ such that:

    $g_I'(\bar{u}) z > 0$
    $h'(\bar{u}) z = 0$;

b) $\mathrm{wedge}(a(1), \ldots, a(p)) + \mathrm{span}(b(1), \ldots, b(k)) = \mathbb{R}^n$,    (23b)

i.e.:

    $\{v \in \mathbb{R}^n : \exists t \geq 0 \ \exists z \in \mathbb{R}^k, \ v = t_1 a(1) + \cdots + t_p a(p) + z_1 b(1) + \cdots + z_k b(k)\} = \mathbb{R}^n$.

B) (Inequalities) We say g satisfies the (Inequality-Problem) Jacobian Criterion
at $\bar{u}$ if one of the following two mutually exclusive conditions
holds:

a) I is nonempty and there exists a $z \in \mathbb{R}^n$ such that:    (24a)

    $g_I'(\bar{u}) z > 0$;

b) $\mathrm{wedge}(a(1), \ldots, a(p)) = \mathbb{R}^n$, i.e.:    (24b)

    $\{v \in \mathbb{R}^n : \exists t, \ 0 \leq t \in \mathbb{R}^p, \ v = t_1 a(1) + \cdots + t_p a(p)\} = \mathbb{R}^n$.

C) (Equalities) We say that h satisfies the (Equality-Problem) Jacobian
Criterion at $\bar{u}$ if one of the following two mutually exclusive conditions
holds:

a) $\mathrm{rank}(h'(\bar{u})) = k < n$;    (25a)

b) $\mathrm{rank}(h'(\bar{u})) = n$,    (25b)

i.e.:

    $\mathrm{span}(b(1), \ldots, b(k)) = \mathbb{R}^n$,

i.e.:

    $\{v \in \mathbb{R}^n : \exists z \in \mathbb{R}^k, \ v = z_1 b(1) + \cdots + z_k b(k)\} = \mathbb{R}^n$.

Remarks.

Our interpretation is that if p = 0 = k then (A), (B), and (C) all hold; if
k = 0 then (A) is equivalent to (B); and if p = 0 then (A) is equivalent to (C).

(A. Mixed) Alternative (23a) is similar to Mangasarian’s generalization^^
of the Arrow-Hurwicz-Uzawa Constraint Qualification.^^ The “obvious”
alternative (23b) is new. We will see that the Jacobian Criterion is sufficient for
mixed Lagrange regularity. And we will show it is a minimal Jacobian
condition.

For the mixed case, the Jacobian Criterion requires that either the sum of 
the wedge and span is “small,” with the wedge lying strictly on one side of 
some hyperplane and the span lying in it, or else it is “large,” spanning the 
whole space. In fact, that proves parts (a) and (b) are mutually exclusive. 

(B. Inequalities) When only inequalities are present, the Jacobian Criterion requires that either the wedge generated by the $g^{i\prime}(\bar u)$ is "small," lying strictly on one side of some hyperplane, or else it is "large," spanning the whole space. Again, that observation proves parts (a) and (b) are mutually exclusive. What the Criterion rules out are the "borderline" cases where neither is true, i.e., where whenever the spanned convex cone lies on one side of some hyperplane, it does not lie strictly on one side. Examples (5) above and (68) below are such examples, with $g^{1\prime}(0)$ and $g^{2\prime}(0)$ pointing in opposite directions.

(C. Equalities) For analogy with cases (A) and (B), we have broken condition (25) into two parts. Together they are equivalent to the single rank hypothesis of the classical Lagrange Multiplier Theorem: $\operatorname{rank}(h'(\bar u)) = \min\{k, n\}$; see Corollary 1c below.

Removable constraints. In many applications, Lagrange multipliers satisfying (4) are sought as a step in finding the constrained maximizing point $\bar u$. If the Jacobian Criterion fails, all is not lost. For it may be that some constraint can be removed without changing the constraint set $C$ near $\bar u$; and the Jacobian Criterion might be satisfied for the reduced set of constraints. In such situations
(e.g. — /i^), theorems using the Jacobian rank conditions should be applied 

to the rank of a "maximally reduced" set of constraints.$^{18}$



16. Mangasarian's "modified Arrow-Hurwicz-Uzawa constraint qualification" in [33], pp. 172-173, which requires that $h$ be $C^1$. His condition with its rank $= k$ restriction is equivalent to our part (a), with its rank $= k < n$ restriction (since the existence of such a $\xi$ as in (23a) requires $k < n$); our two parts (a) and (b) together are weaker, however, than his modified Arrow-Hurwicz condition.

17. Cf. the hypothesis of Theorem 3 in [4].

18. This is an application of our remark earlier, page 103, that Lagrange regularity depends on the particular constraint functions $g$ and $h$, and not just on the constraint set $C(g,h)$. Cf. Condition Ri in [2], p. 8, where g^ represents a reduced set of constraints. (Page 162 in [3].) Cf. Remark (6.1.iii) below, page 126, for an analogue with path conditions where Lagrange regularity depends on the constraint functions, rather than on the constraint set.
4. The Jacobian criterion is sufficient

The Jacobian Criterion is sufficient for Lagrange regularity.

Theorem 1. Let $U \subseteq \mathbb{R}^n$ be open, and $\bar u \in U$.

A) (Mixed: Equalities and inequalities) The Jacobian Criterion (A)$^{19}$ is $(\bar u, \mathcal{G}_D(\bar u), \mathcal{H}_{DC}(\bar u), \mathcal{F}_D(\bar u))$-sufficient for Lagrange mixed-regularity.$^{20}$ In other words:

Suppose $g: U \to \mathbb{R}^m$ is differentiable at $\bar u$, and $h: U \to \mathbb{R}^k$ belongs to $\mathcal{H}_{DC}(\bar u)$. If the Jacobian Criterion (23a, b) holds for $(g, h)$ at $\bar u$, then $(g, h)$ is Lagrange mixed-regular for $(\bar u, \mathcal{F}_D(\bar u))$.

Then if $\bar u$ maximizes $f$ subject to $g \geq 0$ and $h = 0$, there exists a $\lambda \in \mathbb{R}^m$ and a $\mu \in \mathbb{R}^k$ such that:

$f'(\bar u)^T + \lambda^T g'(\bar u) + \mu^T h'(\bar u) = 0$ (26)

$\lambda \geq 0$ (27a)

$\lambda_i = 0$ if $i \in J$. (27b)

B) (Inequalities)$^{21}$ The Jacobian Criterion (B)$^{22}$ is $(\bar u, \mathcal{G}_D(\bar u), \mathcal{F}_D(\bar u))$-sufficient for Lagrange inequality-regularity. In other words:

Suppose $g: U \to \mathbb{R}^m$ is differentiable at $\bar u$. If the Jacobian Criterion (24) holds for $g$ at $\bar u$, then $g$ is Lagrange inequality-regular for $(\bar u, \mathcal{F}_D(\bar u))$.

Then if $\bar u$ maximizes $f$ subject to $g \geq 0$, there exists a $\lambda \in \mathbb{R}^m$ such that:$^{23}$

$f'(\bar u)^T + \lambda^T g'(\bar u) = 0$, (28)

$\lambda \geq 0$ (29a)

$\lambda_j = 0$ if $j \in J$. (29b)



19. Page 108(A).

20. In particular, the smoothness conditions on the functions are satisfied if $h$ is differentiable in a neighborhood of $\bar u$, and $f$ and $g$ are differentiable at $\bar u$. Sufficiency is defined on p. 106.

21. In this case $g$ is present and $h$ is absent. Equivalently, we can set $h = 0$ in part (A) and delete references to, and terms involving, $\mu$ or $h$.

22. Page 108(B).

23. $J$ is defined in (7).
C) (Equalities)$^{24}$ The Jacobian Criterion (C)$^{25}$ is $(\bar u, \mathcal{H}_{DC}(\bar u), \mathcal{F}_D(\bar u))$-sufficient for Lagrange equality-regularity and is $(\bar u, \mathcal{H}_P(\bar u), \mathcal{F}_P(\bar u))$-sufficient when $k = n$. In other words:

Suppose that either $h \in \mathcal{H}_{DC}(\bar u)$, or that $h$ has partial derivatives at $\bar u$ and that $k = n$. If the Jacobian Criterion (25) holds for $h$ at $\bar u$, then $h$ is Lagrange regular for $(\bar u, \ldots)$.$^{26}$

Then if $\bar u$ maximizes $f$ subject to $h = 0$, there exists a $\mu \in \mathbb{R}^k$ such that:

$f'(\bar u)^T + \mu^T h'(\bar u) = 0$. (30)



Our proof of Theorem 1 begins on page 117. 

We can immediately derive special cases of Theorem 1, by taking advantage of the "trivial" second parts, (23b), (24b), (25b), of the Jacobian Criterion. As an obvious special case of Theorem 1.A, for example, we have:



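Corollary 1a. Mixed: Equalities and inequalities. Let $U \subseteq \mathbb{R}^n$ be open and let $\bar u \in U$. Suppose $g \in \mathcal{G}_D(\bar u)$ and $h \in \mathcal{H}_{DC}(\bar u)$; i.e. $g: U \to \mathbb{R}^m$ and $h: U \to \mathbb{R}^k$ are differentiable at $\bar u$, and $h$ is continuous in a neighborhood of $\bar u$. Let $g^1, \ldots, g^p$ be the binding inequality constraints at $\bar u$. If:

$$\operatorname{rank}\begin{pmatrix} g_I'(\bar u) \\ h'(\bar u) \end{pmatrix} = p + k, \qquad (31)$$

then $(g, h)$ is Lagrange mixed-regular for $(\bar u, \mathcal{F}_D(\bar u))$, uniquely.$^{27}$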

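Proof. Rank condition (31) implies that $\operatorname{rank}(h'(\bar u)) = k$, and also that the stacked matrix $\begin{pmatrix} g_I'(\bar u) \\ h'(\bar u) \end{pmatrix}$ is onto, so there are many $\xi$ satisfying the Jacobian condition (23a), hence the corollary follows from Theorem 1.A. The rank condition also guarantees uniqueness of solutions $(\lambda, \mu)$ to $g_I'(\bar u)^T \lambda + h'(\bar u)^T \mu = -f'(\bar u)$, hence to (26) and (27). □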



As an obvious special case of Theorem 1.B, we have:

Corollary 1b. Inequalities. Let $U \subseteq \mathbb{R}^n$ be open and let $\bar u \in U$. Suppose $g \in \mathcal{G}_D(\bar u)$, so $g: U \to \mathbb{R}^m$ is differentiable at $\bar u$; and suppose $g^1, \ldots, g^p$ are the binding constraints at $\bar u$. If:

$\operatorname{rank}(g_I'(\bar u)) = p$, (32)

then $g$ is Lagrange inequality-regular for $(\bar u, \mathcal{F}_D(\bar u))$, uniquely.

24. In this case $h$ is present and $g$ is absent. Equivalently, we can set $g = 0$ in part (A) and delete references to, and terms involving, $\lambda$ or $g$.

25. Page 108(C).

26. In particular, the smoothness conditions on the functions are satisfied if $h$ is differentiable in a neighborhood of $\bar u$, and $f$ is differentiable at $\bar u$.

27. Cf. [36], Theorem 3.1, p. 180.
Proof. Rank condition (32) implies that $g_I'(\bar u)$ is onto, so there are many $\xi$ satisfying the Jacobian condition (24a), hence the corollary follows from Theorem 1.B. The rank condition also implies that the solution $\lambda$ to $g_I'(\bar u)^T \lambda = -f'(\bar u)$ is unique. □

As an obvious special case of Theorem 1.C, we have:

Corollary 1c. Equalities. Let $U \subseteq \mathbb{R}^n$ be open and let $\bar u \in U$. Suppose $h \in \mathcal{H}_{DC}(\bar u)$, so $h: U \to \mathbb{R}^k$ is differentiable at $\bar u$, and continuous in some neighborhood of $\bar u$. If:

$\operatorname{rank}(h'(\bar u)) = \min\{k, n\}$ (33)

(i.e. $\operatorname{rank}(h'(\bar u)) = k < n$ or $\operatorname{rank}(h'(\bar u)) = n$), then $h$ is Lagrange equality-regular for $(\bar u, \mathcal{F}_D(\bar u))$, uniquely.

Proof. If $\operatorname{rank}(h'(\bar u)) = k < n$, then (25a) holds, so the corollary follows from Theorem 1.C. If the rank equals $n$, then the Jacobian condition (25b) holds. In either case, the rank condition guarantees uniqueness of the solution to $h'(\bar u)^T \mu = -f'(\bar u)$. □
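The rank hypotheses (31), (32), and (33) are finite linear-algebra checks, so they are easy to test on a given Jacobian. The following is a minimal sketch under stated assumptions, not part of the original text: the function names `rank`, `mixed_rank_condition`, and `equality_rank_condition` are hypothetical, Jacobians are passed as lists of rows, and exact rational arithmetic is used so that no rounding tolerance is needed.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0  # number of pivots found so far
    for c in range(ncols):
        # find a row at or below position r with a nonzero entry in column c
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def mixed_rank_condition(gI_jac, h_jac):
    """Condition (31): the stacked Jacobian has rank p + k.

    With h_jac empty this is condition (32): rank(g_I') = p.
    """
    stacked = list(gI_jac) + list(h_jac)
    return rank(stacked) == len(stacked)


def equality_rank_condition(h_jac, n):
    """Condition (33): rank(h') = min{k, n}."""
    return rank(h_jac) == min(len(h_jac), n)
```

For instance, with one binding inequality row `[1, 0, 0]` and one equality row `[0, 1, 0]` in three variables (illustrative values only), `mixed_rank_condition` returns `True`, since the two rows are linearly independent.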

Remark 4.1 on Corollaries.

i) It is important to note that the rank conditions (31) and (32) refer to the Jacobian of only the binding constraints. Indeed, the conditions are weaker than the conditions that would result if we replaced $g_I'(\bar u)$ by $g'(\bar u)$. For if the $m$ rows of $g'(\bar u)$ are linearly independent, then so are the $p$ rows of the submatrix $g_I'(\bar u)$, a fortiori; and the converse fails.

On the other hand, conditions (31) and (32) are stronger than the conditions that would result if we replaced $p + k$ and $p$ by $\min\{p + k, n\}$ and $\min\{p, n\}$, respectively, even though (31) and (32) imply $p + k \leq n$ and $p \leq n$, respectively. Indeed, the corollaries would fail under the weaker conditions. Consider, for example, this modification of the well known Kuhn-Tucker example [29], p. 484:

$g^1(u_1, u_2) = -u_1^2 - u_2$

$g^2(u_1, u_2) = u_1$ (34)

$g^3(u_1, u_2) = u_2$.

Here $C(g) = \{(0,0)\}$ and all three constraints are binding. Although $\operatorname{rank}(g'(0,0)) = 2 = n = \min\{p, n\}$, the rank is less than $3 = p$. Not only does our condition (32) fail, but so does Lagrange regularity (consider $f(u_1, u_2) = u_1$).
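The failure of Lagrange regularity in example (34) can be checked mechanically. Conditions (28) and (29) at the origin ask that $-f'(0,0)$ be a nonnegative combination of the rows of $g'(0,0)$, i.e. that it lie in the wedge they generate. The sketch below is an illustration under stated assumptions, not part of the original argument: the helper name `in_wedge_2d` is hypothetical, and it relies on the fact that in the plane any vector of a wedge can be written using at most two linearly independent generators.

```python
from fractions import Fraction
from itertools import combinations


def in_wedge_2d(c, rows):
    """True if c = sum_i t_i * rows[i] with all t_i >= 0, for vectors in R^2."""
    c = tuple(Fraction(x) for x in c)
    rows = [tuple(Fraction(x) for x in r) for r in rows]
    if c == (0, 0):
        return True
    # One generator: c must be a positive multiple of it.
    for r in rows:
        parallel = r[0] * c[1] - r[1] * c[0] == 0
        if parallel and r[0] * c[0] + r[1] * c[1] > 0:
            return True
    # Two independent generators a, b: solve c = t*a + s*b by Cramer's rule.
    for a, b in combinations(rows, 2):
        det = a[0] * b[1] - a[1] * b[0]
        if det == 0:
            continue
        t = (c[0] * b[1] - c[1] * b[0]) / det
        s = (a[0] * c[1] - a[1] * c[0]) / det
        if t >= 0 and s >= 0:
            return True
    return False


# Rows of g'(0,0) for example (34): gradients of g^1, g^2, g^3 at the origin.
jacobian_34 = [(0, -1), (1, 0), (0, 1)]

# f(u1, u2) = u1 has f'(0,0) = (1, 0); multipliers exist iff -f' is in the wedge.
multipliers_exist_for_u1 = in_wedge_2d((-1, 0), jacobian_34)
```

Here the wedge is the half-plane $\{v : v_1 \geq 0\}$, which does not contain $(-1, 0)$, so `multipliers_exist_for_u1` is `False`, in agreement with the remark.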

ii) Conditions (31), (32), and (33) are stronger than necessary, as simple examples show.$^{28}$

28. For example, if $g^1(u_1, u_2) = u_1 + u_2$ and $g^2 = g^1$, then $g = (g^1, g^2)$ is Lagrange regular at (0, 0), even though rank condition (32) fails. Cf. the comments on removable constraints in pp. 103, 109, and 126.
Remark 4.2 on Theorem 1. Although the constraint set $C(g,h)$ need not be open, we cannot drop the assumption that the domain $U$ of $g$ and $h$ is open. Consider the function $h$ defined by:

$h(u_1, u_2) = u_1 + u_2 - 1$ (35)

on the nonnegative quadrant of $\mathbb{R}^2$. Even though $h$ is continuous and differentiable at (1, 0), and satisfies the Jacobian Criterion (25) there, $h$ is not Lagrange equality-regular there (the function $f(u_1, u_2) = u_1$ is maximized at (1, 0), but no Lagrange multiplier exists there). However, when we switch from Jacobian conditions to path conditions, the openness assumption will be dropped; see Theorem 3 and the remark following it.

(A. Mixed) This theorem has weaker hypotheses than most earlier results, in two respects: the Jacobian Criterion is a weaker constraint qualification (allowing the new "obvious" alternative (23b)), and the differentiability conditions on the equality constraints are weaker smoothness assumptions.$^{29}$

The continuity hypothesis on $h$ can be weakened significantly, as indicated in the Continuity remark and Theorem 1+ below, but it cannot be dispensed with altogether, as shown by the example in that remark.

(B. Inequalities) Part (B) solves a traditional problem, considered in Karush [25], in Kuhn and Tucker [29], and in Arrow, Hurwicz, and Uzawa [4]. The Jacobian constraint qualification (24) is weaker than the Jacobian constraint qualifications in previous work (allowing the new "obvious" alternative, (b)).

(C. Equalities) This is the classical Lagrange Multiplier Theorem, except we have weakened the usual hypothesis on $h$ to differentiability at $\bar u$ and continuity in a neighborhood.$^{30}$ This allows, for example, $h = 0$ as an equality constraint, where:



$$h(u_1, u_2) = u_2 - \begin{cases} u_1^2 \sin(1/u_1), & \text{if } u_1 \neq 0 \\ 0 & \text{otherwise,} \end{cases} \qquad (36)$$



As another example, consider:

$g^1(x_1, x_2) = 6 - 3x_1 - 3x_2 \geq 0$
$g^2(x_1, x_2) = 6 - 4x_1 - 2x_2 \geq 0$
$g^3(x_1, x_2) = 6 - 2x_1 - 4x_2 \geq 0$,

which are binding at (2, 2), with $\operatorname{rank}(g_I'(2, 2)) < 3 = p$; so property (32) fails. Yet $g_I'(2, 2)\xi > 0$ for $\xi = (1, 1)$, so $g$ is Lagrange regular at (2, 2) by Theorem 1.B.

29. The theorem holds for $\mathcal{H}_{DC}(\bar u)$ instead of just for $\mathcal{H}_{C^1}(\bar u)$. Halkin [16] used similarly weak conditions on $g$ and $h$, obtaining Lagrange quasi-regularity as a somewhat weaker conclusion; see the Historical Comments and Comparisons in Section 8.

30. In particular, the theorem holds for constraints in $\mathcal{H}_{DC}(\bar u)$ instead of just for those in $\mathcal{H}_{C^1}(\bar u)$. See the preceding footnote.
even though it is not $C^1$ at the origin.

When $k \leq n$, the Lagrange multiplier vector $\mu$ of (30) is unique.$^{31}$

Remark 4.3 on continuity of h.

i) The differentiability and continuity condition $h \in \mathcal{H}_{DC}$ in the Mixed and in the Equalities Problem is weaker than usual (cf. footnote 29). Note that when $k = n$ neither differentiability nor continuity of $h$ is required in Theorem 1.C, but only the existence of partial derivatives. For the general case, however, continuity cannot be dropped completely, as the following "nonsubstitution" Example 4.1 shows.

Example 4.1. The function $h: \mathbb{R}^2 \to \mathbb{R}$,

$$h(x, y) = \begin{cases} x + y, & \text{if } x + y \neq 0 \\ [\ldots], & \text{if } x + y = 0 \end{cases} \qquad (37)$$



is differentiable at (0, 0), with $h'(0, 0) = (1, 1)$. And it satisfies the Jacobian rank condition (25a). Yet it is not Lagrange regular for even the class of linear functions, even though it is differentiable at the origin. For its zero level set allows no "substitution," since the constraint set $C(h)$ contains only the origin (0, 0); so if we define the function $f: \mathbb{R}^2 \to \mathbb{R}$ by $f(x, y) = y$, then $f$ is maximized on $C(h)$ at (0, 0), but no real $\mu$ can satisfy:

$$(0, 1) + \mu\,(1, 1) = f'(0,0)^T + \mu\, h'(0,0) = (0, 0). \qquad (38)$$

The hypothesis of Theorem 1.C that fails here is continuity: for every $x$, the function $h$ is discontinuous at $(x, y)$ when $y = -x \neq 0$, hence $h$ is discontinuous in every neighborhood of (0, 0).

ii) We can, however, weaken the continuity hypothesis on $h$ significantly. When the Jacobian Condition holds, so $h'(\bar u)$ has full rank $k$, we can partition $\mathbb{R}^n$ so that $\bar u = (\bar x, \bar y) \in X \times Y = \mathbb{R}^{n-k} \times \mathbb{R}^k$, with $\operatorname{rank}(h_y(\bar x, \bar y)) = k$. Then as the proof of Theorem 1 below makes clear, it is only continuity on $Y$ that is needed. Indeed, the proof shows that continuity on $Y$ is only required for a sequence of $x^j$ converging to $\bar x$, and only in a certain neighborhood whose diameter diminishes with $\|x^j - \bar x\|$.

Theorem 1+. Let $U \subseteq \mathbb{R}^n$ be open, and $\bar u \in U$.



31. Cf. Caratheodory [9], p. 177, Bliss [6], p. 210 for $k < n$. When $k \leq n$, rank condition (25) guarantees that the $k$ rows of the matrix corresponding to $h'(\bar u)$ are linearly independent, so (30) can be uniquely solved for $\mu$.
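A) (Mixed: Equalities and inequalities) Suppose $g: U \to \mathbb{R}^m$ and $h: U \to \mathbb{R}^k$ are differentiable at $\bar u$, and one of the following two mutually exclusive conditions (a), (b) holds:

a) there exists a factoring $(X, Y) = \mathbb{R}^{n-k} \times \mathbb{R}^k$ and a real $\gamma$ with:

$\operatorname{rank}(h_y(\bar x, \bar y)) = k < n$, (39a.i)

$\|(h_y(\bar x, \bar y))^{-1} h_x(\bar x, \bar y)\| < \gamma$, (39a.ii)

and for any $x \in X$:

there exist real $t^j \searrow 0$ for which $h(t^j x, \cdot\,)$ is continuous on $B_{\gamma \|t^j x\|}(\bar y)$, and there exists a $\xi \in \mathbb{R}^n$ such that:

$g_I'(\bar u)\xi > 0$,

$h'(\bar u)\xi = 0$;

b) $\operatorname{wedge}(a(1), \ldots, a(p)) + \operatorname{span}(b(1), \ldots, b(k)) = \mathbb{R}^n$, (39b)

i.e.:

$\{v \in \mathbb{R}^n : \exists\, t \in \mathbb{R}^p \text{ with } t \geq 0,\ \exists\, z \in \mathbb{R}^k,\ v = t_1 a(1) + \cdots + t_p a(p) + z_1 b(1) + \cdots + z_k b(k)\} = \mathbb{R}^n$.

Then $(g, h)$ is Lagrange mixed-regular for $(\bar u, \mathcal{F}_D(\bar u))$. Then if $\bar u$ maximizes $f$ subject to $g \geq 0$ and $h = 0$, there exists a $\lambda \in \mathbb{R}^m$ and a $\mu \in \mathbb{R}^k$ such that:

$f'(\bar u)^T + \lambda^T g'(\bar u) + \mu^T h'(\bar u) = 0$, (40a)

$\lambda \geq 0$, (40b)

$\lambda_i = 0$ if $i \in J$. (40c)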

B) (Inequalities)$^{32}$ This is the same as in Theorem 1(B): The Jacobian Criterion (B)$^{33}$ is $(\bar u, \mathcal{G}_D(\bar u), \mathcal{F}_D(\bar u))$-sufficient for Lagrange inequality-regularity.

C) (Equalities)$^{34}$ Suppose $h: U \to \mathbb{R}^k$ is differentiable at $\bar u$, and one of the following two mutually exclusive conditions (a), (b) holds:

a) there exists a factoring $(X, Y) = \mathbb{R}^{n-k} \times \mathbb{R}^k$ and a real $\gamma$ with:

$\operatorname{rank}(h_y(\bar x, \bar y)) = k < n$, (41a.i)

$\|(h_y(\bar x, \bar y))^{-1} h_x(\bar x, \bar y)\| < \gamma$, (41a.ii)

32. In this case $g$ is present and $h$ is absent. Equivalently, we can set $h = 0$ in part (A) and delete references to, and terms involving, $\mu$ or $h$.

33. Page 108(B).

34. In this case $h$ is present and $g$ is absent. Equivalently, we can set $g = 0$ in part (A) and delete references to, and terms involving, $\lambda$ or $g$.
and for any $x \in X$:

there exist real $t^j \searrow 0$ for which $h(t^j x, \cdot\,)$ is continuous on $B_{\gamma \|t^j x\|}(\bar y)$;

b) $\operatorname{rank}(h_y(\bar x, \bar y)) = k = n$. (41b)

Then $h$ is Lagrange regular for $(\bar u, \mathcal{F}_D)$. So if $\bar u$ maximizes $f$ subject to $h = 0$, there exists a $\mu \in \mathbb{R}^k$ such that:

$f'(\bar u)^T + \mu^T h'(\bar u) = 0$. (42)

Remark 4.4 on continuity. i) (Weakening continuity) Theorem 1 used the same continuity assumption on the equality constraint functions $h$ as Halkin [16], but obtained a stronger conclusion. Theorem 1+, on the other hand, weakens the continuity assumption on $h$ while retaining our full regularity conclusion. In the following example, Theorem 1+ guarantees Lagrange regularity, even though it allows some discontinuity in every neighborhood of the constrained maximum.

Example 4.2. For the open disk $U = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 < 1\}$, we define the function $h: U \to \mathbb{R}$ by:

$$h(x, y) = \begin{cases} x + y, & \text{if } x \neq 0 \\ y + y^2 & \text{otherwise.} \end{cases} \qquad (43)$$

This constraint function $h$ satisfies the conditions of Theorem 1+, but not those of Theorem 1. First, it is easy to see that $h$ is differentiable at (0, 0). Second, $h(x, y) = 0$ if and only if $x + y = 0$, and for any such $x$ and $y$ we have $\partial h(x, y)/\partial x = 1 = \partial h(x, y)/\partial y$. Third, constrained maxima of $f(x, y)$ clearly correspond to unconstrained maxima of $f(x, -x)$, and at such points we have $\partial f(x, y)/\partial x = \partial f(x, y)/\partial y$. Thus $\mu = -\partial f(x, y)/\partial x$ satisfies the Lagrange equality (42) at any maximizing point $(x, y)$.
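The two properties used in Example 4.2, namely that the zero set of $h$ inside the disk is exactly the line $x + y = 0$ and that $h$ jumps across the $y$-axis, can be spot-checked on a grid. This is only a sketch under stated assumptions: the second branch of (43) is taken to be $y + y^2$, the grid spacing of 1/10 is an arbitrary illustrative choice, and a grid check is of course not a proof.

```python
from fractions import Fraction


def h(x, y):
    """Constraint function of Example 4.2; the branch y + y*y is an assumed reading of (43)."""
    return x + y if x != 0 else y + y * y


# Exact rational grid points inside the open unit disk.
grid = [Fraction(i, 10) for i in range(-9, 10)]
disk = [(x, y) for x in grid for y in grid if x * x + y * y < 1]

# Points where "h = 0" and "x + y = 0" disagree (expected to be none).
mismatches = [(x, y) for (x, y) in disk if (h(x, y) == 0) != (x + y == 0)]

# Discontinuity across the y-axis: compare h on the axis with h just beside it.
on_axis = h(Fraction(0), Fraction(1, 2))
beside_axis = h(Fraction(1, 100), Fraction(1, 2))
```

On the axis the value is $1/2 + 1/4$, while just beside it the value is close to $1/2$, which illustrates why $h$ is discontinuous in every neighborhood of the origin except along shrinking balls of the kind Theorem 1+ allows.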

ii) (But not too far) It is tempting to weaken further the continuity condition on $h$, by changing the strict inequalities (39a.ii) and (41a.ii) to weak inequalities. That would only require continuity of $h(x, \cdot\,)$ on $B_{\gamma \|x\|}(y)$. But our next example shows that the Lagrange regularity result of Theorem 1+ would fail.

Example 4.3. First we define the function $g: \mathbb{R} \to \mathbb{R}$ by:

$g(x) = -(x + x^3)$, (44)

and then we define the function $h: \mathbb{R}^2 \to \mathbb{R}$ by:

$$h(x, y) = \begin{cases} x - g^{-1}(y), & \text{if } x - g^{-1}(y) \neq 0 \\ x^2 + y^2, & \text{otherwise.} \end{cases} \qquad (45)$$
The function $h$ is differentiable at the origin, and the Jacobian Criterion (25) holds, since:

$\dfrac{\partial h}{\partial x}(0, 0) = 1$

$\dfrac{\partial h}{\partial y}(0, 0) = 1$. (46)

And $\|(h_y(0, 0))^{-1} h_x(0, 0)\| = 1$.

[Figure: the dashed level curve of $h$ through the origin; vertical axis $y$.]

Here the dashed level curve through the origin is the locus of the only discontinuities of $h$. So for each $x$, the function $h(x, \cdot\,)$ is continuous on the $y$-ball $B_{\gamma \|x\|}(0)$ for $\gamma = \|(h_y(0, 0))^{-1} h_x(0, 0)\|$, though not for any greater $\gamma$.

And $h$ is not Lagrange regular at (0, 0), since the constraint set $C(h)$ is the singleton origin, and the function $f(x, y) = y$ is maximized there, but cannot satisfy the regularity requirement, which is the same as in (38) in Example 4.1. The hypothesis that fails is $h(x, \cdot\,)$-continuity for some $\gamma > \|(h_y(0, 0))^{-1} h_x(0, 0)\|$.

The proof itself of Theorem 1+ fails because, as this same example shows, the Non-$C^1$ Implicit Function Theorem (Theorem 6 below) no longer holds under this weakened continuity hypothesis on $h$.

Proof of Theorem 1.

(A. Mixed) If (23b) holds, it is immediate that $\lambda$ and $\mu$ can be found satisfying (26) and (27). So we assume that condition (23a) holds. Thus the index set$^{35}$ $I = \{1, \ldots, p\}$ of binding constraints is nonempty, and $\operatorname{rank}(h'(\bar u)) = k < n$. By the Reduction principle (16), it suffices to find $\lambda \in \mathbb{R}^p$ and $\mu \in \mathbb{R}^k$ satisfying (26) and (27a).

35. $I$ is defined in (7).

We complete the proof by contradiction. Suppose that, for some $f \in \mathcal{F}_D(\bar u)$, there do not exist such nonnegative $\lambda \in \mathbb{R}^p$ and a $\mu \in \mathbb{R}^k$. Then by the Fundamental Lemma there exists a $\hat z \in \mathbb{R}^n$ such that:

$g_I'(\bar u)\hat z \geq 0$ (47a)

$h'(\bar u)\hat z = 0$ (47b)

$f'(\bar u) \cdot \hat z > 0$. (47c)

For all $s \in (0, 1]$ define:

$z(s) = \hat z + s\xi$, (48)

for any $\xi$ satisfying (23a). With (47a,b,c), and (23a) we then have:

$g_I'(\bar u) z(s) > 0$ for all $s \in (0, 1]$ (49)

$h'(\bar u) z(s) = 0$ for all $s \in (0, 1]$ (50)

$f'(\bar u) \cdot z(s) = f'(\bar u) \cdot \hat z + s f'(\bar u) \cdot \xi > 0$, (51)

for all small positive $s$. We define $\bar z = z(s)$ for one such small $s$.
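The perturbation step (48) to (51) turns the weak inequality (47a) into the strict inequality (49) while keeping (50), at the cost of requiring $s$ to be small in (51). A minimal numerical sketch follows; all vectors in it are hypothetical illustrative data chosen to satisfy (47a,b,c) and (23a), and the helper names `dot`, `z_of_s`, and `check` are not from the original.

```python
from fractions import Fraction


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def z_of_s(z_hat, xi, s):
    """The perturbed direction z(s) = z_hat + s * xi of (48)."""
    return [zh + s * x for zh, x in zip(z_hat, xi)]


# Hypothetical data in R^3 (assumptions for illustration only).
a = [1, 0, 0]         # single binding row of g_I'(u)
b = [0, 0, 1]         # single row of h'(u)
f_grad = [-1, 1, 0]   # f'(u)
z_hat = [0, 1, 0]     # a.z_hat = 0 >= 0, b.z_hat = 0, f'.z_hat = 1 > 0
xi = [1, 0, 0]        # a.xi = 1 > 0, b.xi = 0, as in (23a)


def check(s):
    """Return the truth values of (49), (50), (51) at the given s."""
    z = z_of_s(z_hat, xi, s)
    return dot(a, z) > 0, dot(b, z) == 0, dot(f_grad, z) > 0
```

With these data, (49) and (50) hold for every $s \in (0, 1]$, whereas (51) reads $1 - s > 0$ and so fails at $s = 1$; this is why the proof fixes one sufficiently small $s$.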



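By the Jacobian Criterion (23a), there is a factoring $\mathbb{R}^n = X \times Y$ such that $\bar u = (\bar x, \bar y)$ and $\operatorname{rank}(h'(\bar u)) = \operatorname{rank}(h_y(\bar x, \bar y)) = k$. Then from $h \in \mathcal{H}_{DC}$ it follows that, for any $\hat x \in X$ and for any small enough $t^j \searrow 0$ the function $h(\bar x + t^j \hat x, \cdot\,)$ is continuous on some neighborhood $B_\delta(\bar y)$ of $\bar y$.

Applying (133) to (49), (50), and (51) with $\bar z = (\hat x, \hat y)$:

$g_{Iu}(\bar u)\bar z = g_{Ix}(\bar x, \bar y)\hat x + g_{Iy}(\bar x, \bar y)\hat y > 0$ (52a)

$h_u(\bar u)\bar z = h_x(\bar x, \bar y)\hat x + h_y(\bar x, \bar y)\hat y = 0$ (52b)

$f_u(\bar u) \cdot \bar z = f_x(\bar x, \bar y)\hat x + f_y(\bar x, \bar y)\hat y > 0$. (52c)

By the Non-$C^1$ Implicit Function Theorem$^{36}$ there is an $\varepsilon > 0$, a subset $X_0 = B_\varepsilon(\bar x) \cap \{t^j \hat x : j = 1, 2, 3, \ldots\} \subseteq \mathbb{R}^{n-k}$, and a function $y(\cdot) : X_0 \to \mathbb{R}^k$ that is differentiable at $\bar x$, such that:

$y(\bar x) = \bar y$ (53a)

$h(x, y(x)) = 0$ for all $x \in X_0$ (53b)

$y'(\bar x) = -(h_y(\bar x, \bar y))^{-1} h_x(\bar x, \bar y)$. (53c)

36. See the Appendix.

By (52b) we also have:

$\hat y = -(h_y(\bar x, \bar y))^{-1} h_x(\bar x, \bar y)\hat x$, (54)

so by the Chain Rule, (53c), and (54):$^{37}$

$\frac{d}{dt}\big|_{t=0}\, y(\bar x + t\hat x) = y'(\bar x)\hat x = \hat y$. (55)

Then the Chain Rule, (52a), and (52c) imply:

$\frac{d}{dt}\big|_{t=0}\, g_I(\bar x + t\hat x, y(\bar x + t\hat x)) = g_{Ix}(\bar x, \bar y) \cdot \hat x + g_{Iy}(\bar x, \bar y) \cdot \hat y > 0$ (56a)

$\frac{d}{dt}\big|_{t=0}\, f(\bar x + t\hat x, y(\bar x + t\hat x)) = f_x(\bar x, \bar y) \cdot \hat x + f_y(\bar x, \bar y) \cdot \hat y > 0$, (56b)

so for all small $t^j > 0$,

$g_I(\bar x + t^j \hat x, y(\bar x + t^j \hat x)) > 0$ (57a)

$g_J(\bar x + t^j \hat x, y(\bar x + t^j \hat x)) \geq 0$ (by (7)) (57b)

$f(\bar x + t^j \hat x, y(\bar x + t^j \hat x)) > f(\bar x, \bar y)$ (57c)

$h(\bar x + t^j \hat x, y(\bar x + t^j \hat x)) = 0$ (by (53b)). (57d)

So for small enough $t^j > 0$ there are points $(\bar x + t^j \hat x, y(\bar x + t^j \hat x)) \in C(g, h)$ at which $f$ attains a value higher than $f(\bar x, \bar y)$, contradicting that $\bar u = (\bar x, \bar y)$ maximizes $f$, and concluding our proof for part (A).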



Proofs for parts (B) and (C) can be constructed along similar lines, but here are sketches.

(B. Inequalities) If (24b) holds, then 0 belongs to the wedge generated by the rows of $g_I'(\bar u)$, so it is again immediate that $\lambda$ can be found satisfying (28) and (29). So we assume that (24a) holds: $g_I'(\bar u)\xi > 0$ for some $\xi \in \mathbb{R}^n$.

If no solution exists then by the Fundamental Lemma there exists a $\hat z \in \mathbb{R}^n$ satisfying (47a) and (47c). For all $s \in (0, 1]$ define $z(s)$ as in (48), so (49) and (51) hold. Thus for small positive $s$ values, $z(s)$ belongs to the constraint set $C(g)$, and also increases the maximand above $f(\bar u)$, contradicting the constrained maximality of $f(\bar u)$.

(C. Equalities) We repeat the proof for the mixed part (A), replacing $\bar z$ by $\hat z$ and omitting all statements concerning the inequality constraints $g$: (47a), (49), (52a), (56a), and (57a,b). □

Proof of Theorem 1+. An almost identical proof applies. The only change is to choose any $\gamma > \|(h_y(\bar x, \bar y))^{-1} h_x(\bar x, \bar y)\|$, and then replace the single neighborhood $B_\delta(\bar y)$ by the neighborhoods $B_{\gamma \|t^j \hat x\|}(\bar y)$. □

37. The derivative is with respect to $t \in \{t^j : j = 1, 2, 3, \ldots\}$ (our implicit function theorem only guaranteed that $y(\bar x + t^j \hat x)$ is defined for such $t$). It is easy to verify that the Chain Rule does apply in this context.
5. The Jacobian criterion is minimal

Section 4 showed the Jacobian Criterion is sufficient for Lagrange regularity. Example 6.1 below, p. 124, shows it is not necessary. Indeed, it shows that no Jacobian constraint qualification can be both necessary and sufficient for Lagrange regularity. Jacobian conditions are blunt instruments for detecting Lagrange regularity.$^{38}$ Nevertheless, Jacobian conditions are of special interest, because they typically have computability properties useful in practical applications.$^{39}$ So it is important to show that the Jacobian Criterion of the previous sections is as weak as possible among Jacobian conditions: if one restricts oneself to Jacobian conditions sufficient for Lagrange regularity, the Jacobian Criterion is minimal (over an appropriate class of constraint functions).

In contrast with sufficiency theorems, weaker smoothness hypotheses on the constraint functions do not necessarily make better minimality theorems. To prove minimality of a Jacobian condition for classes $\mathcal{G}$ and $\mathcal{H}$ of inequality and equality constraints, we must show that failure of the condition for some $g$ and $h$ implies there exist $\hat g$ and $\hat h$ in the same classes $\mathcal{G}$ and $\mathcal{H}$, respectively, that are not Lagrange regular. The smaller the classes $\mathcal{G}$ and $\mathcal{H}$, the harder it may be to obtain such $\hat g$ and $\hat h$. (And there is no guarantee that every class $\mathcal{H}$ will have a minimally sufficient Jacobian condition for Lagrange regularity.$^{40}$) We present below a minimality result for any classes $\mathcal{G}$ and $\mathcal{H}$ that contain at least the quadratic functions.

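Theorem 2. Let $U$ be an open subset of $\mathbb{R}^n$, let $\bar u \in U \subseteq \mathbb{R}^{n-k} \times \mathbb{R}^k$, and suppose that function classes $\mathcal{G} \subseteq \mathcal{G}_D(\bar u)$ and $\mathcal{H} \subseteq \mathcal{H}_{DC}(\bar u)$ contain all quadratic functions.$^{41}$

A) (Mixed: Equalities and inequalities) If the Jacobian Criterion (A) is $(\bar u, \mathcal{G}, \mathcal{H}, \mathcal{F}_D(\bar u))$-sufficient for Lagrange mixed-regularity, then it is minimally $(\bar u, \mathcal{G}, \mathcal{H}, \mathcal{F}_D(\bar u))$-sufficient for Lagrange mixed-regularity.

B) (Inequalities) If the Jacobian Criterion (B) is $(\bar u, \mathcal{G}, \mathcal{F}_D(\bar u))$-sufficient for Lagrange inequality-regularity, then it is minimally $(\bar u, \mathcal{G}, \mathcal{F}_D(\bar u))$-sufficient for Lagrange inequality-regularity.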



38. Although we obtain a necessary and sufficient constraint qualification in Section 6 (Theorems 3 and 4) below, it is a "path condition," not a Jacobian condition.

39. By contrast, path conditions (Section 6) may be difficult to apply.

40. An extreme example is any class $\mathcal{H}$ consisting of a singleton $h$. If there is a Jacobian condition $Q$ that is $(\bar u, \mathcal{H}, \mathcal{F}_D(\bar u))$-sufficient for regularity, then any $\hat h$ with $\hat h'(\bar u) \notin Q$ must have $\hat h'(\bar u) \neq h'(\bar u)$, and so cannot belong to $\mathcal{H}$.

41. See p. 102 for definitions of Lagrange regularity, and p. 106 for definitions of minimal sufficiency.

In fact, as seen from the proof below, Theorem 2 remains true even if the class of maximands is as narrow as the class of linear functions.
C) (Equalities) If the Jacobian Criterion (C) is $(\bar u, \mathcal{H}, \mathcal{F}_D(\bar u))$-sufficient for Lagrange equality-regularity, then it is minimally $(\bar u, \mathcal{H}, \mathcal{F}_D(\bar u))$-sufficient for Lagrange equality-regularity.

In particular, therefore, by Theorem 1 we have this corollary for $\mathcal{G} = \mathcal{G}_D$ and $\mathcal{H} = \mathcal{H}_{DC}$:

Corollary of Theorem 2. Let $U$ be an open subset of $\mathbb{R}^n$ and let $\bar u = (\bar x, \bar y) \in U \subseteq \mathbb{R}^n$.

A) (Mixed: Equalities and inequalities) The Jacobian Criterion (A) is minimally $(\bar u, \mathcal{G}_D(\bar u), \mathcal{H}_{DC}(\bar u), \mathcal{F}_D(\bar u))$-sufficient for Lagrange mixed-regularity.

B) (Inequalities) The Jacobian Criterion (B) is minimally $(\bar u, \mathcal{G}_D(\bar u), \mathcal{F}_D(\bar u))$-sufficient for Lagrange inequality-regularity.

C) (Equalities) The Jacobian Criterion (C),

$\operatorname{rank}(h'(\bar u)) = \min\{k, n\}$, (58)

is minimally $(\bar u, \mathcal{H}_{DC}(\bar u), \mathcal{F}_D(\bar u))$-sufficient for Lagrange equality-regularity.

Proof of Theorem 2.

(Part (A)) Let the Jacobian Criterion (A) be $(\bar u, \mathcal{G}, \mathcal{H}, \mathcal{F}_D(\bar u))$-sufficient for Lagrange mixed-regularity, so property (i) of the definition of minimal sufficiency$^{42}$ holds.

To see that part (ii) of the minimality definition holds, suppose some $g \in \mathcal{G}$ and $h \in \mathcal{H}$ do not satisfy the Jacobian Criterion (A). Then $p > 0$ or $k > 0$ by Remark 3.1; in what follows, references to $g$ should be disregarded when $p = 0$, and references to $h$ should be disregarded when $k = 0$. It then suffices to show there exist $\hat g \in \mathcal{G}$ and $\hat h \in \mathcal{H}$ with $\hat g_I'(\bar u) = g_I'(\bar u)$ and $\hat h'(\bar u) = h'(\bar u)$, for which $(\hat g, \hat h)$ is not Lagrange regular for $(\bar u, \mathcal{F}_D(\bar u))$.

Without loss of generality we assume $U$ is a neighborhood of the origin $0 \in \mathbb{R}^n$, and that $\bar u = 0$. Define matrix $A = g_I'(0)$ and matrix $B = h'(0)$. Since $(g, h)$ violates the Jacobian Criterion (A), we know that $(A, B)$ cannot satisfy either of these two mutually exclusive conditions:

a) $\operatorname{rank}(B) = k < n$ and there exists a $\xi \in \mathbb{R}^n$ such that: (59a)

$A\xi > 0$
$B\xi = 0$.

42. Page 106.




122 L. Hurwicz, M.K. Richter 



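b) $\operatorname{wedge}(\{a(1), \ldots, a(p)\}) + \operatorname{span}(\{b(1), \ldots, b(k)\}) = \mathbb{R}^n$, (59b)

i.e.:

$\{t_1 a(1) + \cdots + t_p a(p) + z_1 b(1) + \cdots + z_k b(k) : \text{real } t_1, \ldots, t_p \geq 0\ \&\ z_1, \ldots, z_k \in \mathbb{R}\} = \mathbb{R}^n$.

It will suffice to find $\hat g$ and $\hat h$ with $\hat g_I'(\bar u) = g_I'(\bar u)$ and $\hat h'(\bar u) = h'(\bar u)$, and such that $C(\hat g, \hat h) = \{0\}$. For then $(\hat g, \hat h)$ is not Lagrange regular, because the failure of (59b) implies that there is some $\gamma \in \mathbb{R}^n$ with:

$\gamma \notin \operatorname{wedge}(\{a(1), \ldots, a(p)\}) + \operatorname{span}(\{b(1), \ldots, b(k)\})$; (60)

so if we define $f(x) = -\gamma \cdot x$ then $f$ is maximized on $C(\hat g, \hat h)$ only at the origin 0, and $-f'(0) = \gamma$, which by (60) cannot satisfy (9) and (10) as required by Lagrange regularity of $(\hat g, \hat h)$.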

To find such $\hat g$ and $\hat h$, first consider the special case that some row $a(i)$ of $A$ equals $0 \in \mathbb{R}^n$. Then for $x \in \mathbb{R}^n$ define $\hat g^i(x) = -(x_1^2 + \cdots + x_n^2)$, so $a(i) = \hat g^{i\prime}(0)$, and $\hat g^i(x) \geq 0$ if and only if $x = 0$; define $\hat g^j = g^j$ for all $j \neq i$. Since $\mathcal{G}$ contains all quadratic functions, $\hat g \in \mathcal{G}$; also $\hat g_I'(0) = A$, and $C(\hat g, h) = \{0\}$. Since $\hat g^i$ is clearly binding at $\bar u = 0$, the same functions in $\hat g$ as in $g$ are binding at $\bar u = 0$. So $\hat g_I'(0) = g_I'(0)$. We can then define $\hat h = h$, and the proof is finished for the special case when some row $a(i) = 0$.

Similarly, if some row $b(j) = 0 \in \mathbb{R}^n$, define $\hat g = g$ and $\hat h^j(x_1, \ldots, x_n) = -(x_1^2 + \cdots + x_n^2)$ and $\hat h^i = h^i$ for all $i \neq j$; by a similar argument, $\hat h \in \mathcal{H}$, $\hat h'(0) = B$, and $C(\hat g, \hat h) = \{0\}$. Since $\hat g_I'(0) = g_I'(0)$ and $\hat h'(0) = h'(0)$, the proof is finished for the special case when some row $b(j) = 0$.

It remains to consider the case that:

$a(i) \neq 0$ for all $i = 1, \ldots, p$
$b(j) \neq 0$ for all $j = 1, \ldots, k$. (61)

We noted that (59a) fails. There are two ways it can fail: i) we might have $\operatorname{rank}(B) < k < n$, or ii) there might exist no $\xi$ with $A\xi > 0$ and $B\xi = 0$.

(i) First consider the case that $\operatorname{rank}(B) < k$. Then (61) implies $\operatorname{rank}(B) > 0$, hence $k > 1$. Then some row of $B$ is a linear combination of the others, say:

$b(1) = z_2 b(2) + \cdots + z_k b(k)$ for some real $z_2, \ldots, z_k$. (62)

By (61) we have $b(1) \neq 0$, so without loss of generality we can choose a basis so that:

$b(1) = (1, 0, \ldots, 0)$. (63)

Now define:

$\hat g^i(x_1, \ldots, x_n) = a(i) \cdot x$ for all $i = 1, \ldots, p$

$\hat h^1(x_1, \ldots, x_n) = x_1 - (x_1^2 + \cdots + x_n^2)$ (64)

$\hat h^i(x_1, \ldots, x_n) = b(i) \cdot x$ for all $i = 2, \ldots, k$.

Then $\hat g_I'(0) = A = g_I'(0)$, $\hat h'(0) = B = h'(0)$, $\hat g \in \mathcal{G}$ and $\hat h \in \mathcal{H}$; and if $\hat g(x) \geq 0$ & $\hat h(x) = 0$ it follows from (62), (63), and (64) that $x_1 = \cdots = x_n = 0$, so again the constraint set contains just the origin: $C(\hat g, \hat h) = \{0\}$, and we are done for case (i).

(ii) Property (59a) could also fail by absence of $\xi \in \mathbb{R}^n$ with $A\xi > 0$ and $B\xi = 0$. Then the Theorem of the Alternative,$^{43}$ yields $t \in \mathbb{R}^p$, $v \in \mathbb{R}^k$ with:

$t \geq 0,\ t \neq 0$
$t^T A + v^T B = 0 \in \mathbb{R}^n$. (65a)

Without loss of generality we can rewrite this as:

$a(1) = -(w_2 a(2) + \cdots + w_p a(p) + z_1 b(1) + \cdots + z_k b(k))$, (65b)

for some real $z_i$ and some real $w_j \geq 0$. Because of (61) we can, also without loss of generality, choose a basis so that:

$a(1) = (1, 0, \ldots, 0) \in \mathbb{R}^n$. (66)

Now define:

$\hat g^1(x_1, \ldots, x_n) = x_1 - (x_1^2 + \cdots + x_n^2)$

$\hat g^i(x_1, \ldots, x_n) = a(i) \cdot x$ for all $i = 2, \ldots, p$ (67)

$\hat h^i(x_1, \ldots, x_n) = b(i) \cdot x$ for all $i = 1, \ldots, k$.

Since $\hat g^1$ is clearly binding at $\bar u = 0$, it follows that $\hat g_I'(0) = A = g_I'(0)$ and $\hat h'(0) = B = h'(0)$, $\hat g \in \mathcal{G}$ and $\hat h \in \mathcal{H}$, and if $\hat g(x) \geq 0$ & $\hat h(x) = 0$ it follows from (65), (66), and (67) that $x_1 = \cdots = x_n = 0$; so again the constraint set contains just the origin: $C(\hat g, \hat h) = \{0\}$, and we are done for this last case (ii).
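The construction (67) can be tried on the smallest borderline instance. The sketch below is an illustration under stated assumptions, not part of the proof: it takes $n = 2$, no equality constraints, and hypothetical rows $a(1) = (1, 0)$ and $a(2) = (-1, 0)$, which satisfy (65b) with $w_2 = 1$ and violate both alternatives of the Jacobian Criterion (B). A grid search over exact rationals then finds the origin as the only feasible point of the modified constraints; the grid itself is an arbitrary choice and is not a substitute for the argument above.

```python
from fractions import Fraction


def g_hat(x):
    """Modified constraints (67) for a(1) = (1, 0), a(2) = (-1, 0) (hypothetical data)."""
    x1, x2 = x
    return (x1 - (x1 * x1 + x2 * x2),  # g_hat^1: gradient (1, 0) at the origin
            -x1)                       # g_hat^2(x) = a(2) . x


def feasible(x):
    """x belongs to C(g_hat) iff every component of g_hat(x) is nonnegative."""
    return all(value >= 0 for value in g_hat(x))


# Exact rational grid on the square [-1, 1] x [-1, 1].
grid = [Fraction(i, 20) for i in range(-20, 21)]
feasible_points = [(x1, x2) for x1 in grid for x2 in grid if feasible((x1, x2))]
```

With $\gamma = (0, 1)$, which lies outside the wedge of the two rows (the $x_1$-axis), the linear maximand $f(x) = -\gamma \cdot x = -x_2$ is maximized on this one-point constraint set at the origin, yet $-f'(0) = \gamma$ admits no multipliers, as in the argument following (59b).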

At this point we have shown for part (A) that $C(\hat g, \hat h) = \{0\}$, which by the sufficiency argument following (59b) completes the proof for part (A).

(Parts (B) and (C)) These follow from obvious modifications of our proof for Part (A), or as a corollary of that part if we set $h = 0$ or $g = 0$.

□

43. See the Appendix, applying the special case of part (11) when $B$ and $\beta$ are absent.
6. The tangency-path criterion

Why consider more than the Jacobian? Because the Lagrange multiplier property (9) is a local property, one might expect that Jacobian properties would suffice to characterize conditions under which Lagrange regularity holds for $(g, h)$. However, they cannot determine Lagrange regularity in all cases, as we see from the following example.

Example 6.1. Consider the inequalities $g \geq 0$ where:

$$g^1(u_1, u_2) = u_2 - \begin{cases} u_1^2 \sin(1/u_1), & \text{if } u_1 \neq 0 \\ 0 & \text{otherwise} \end{cases}$$

$$g^2(u_1, u_2) = -u_2. \qquad (68)$$

It will be evident from Theorem 3 that, for any partially differentiable function $f$ that is maximized at $\bar u = (0, 0)$ subject to the constraints (68), there do exist Lagrange multipliers satisfying (9) and (10). By contrast, the function $g$ of (5) is not Lagrange inequality-regular for even the class of linear functions, even though it has the same Jacobian $g'(0, 0)$ as (68) above.$^{44}$ Similar remarks apply to equality constraints.

Ensuring Lagrange regularity amounts to guaranteeing existence of feasible
paths whose tangents have certain algebraic properties (see our discussion
of (21) in Part 2.2). Such paths were provided in our proof of Theorem 1 through
Jacobian conditions on $g$ and $h$. Now, in the spirit of Karush, Kuhn and Tucker,
and others, we will obtain them by directly assuming their existence, but with
the weakest possible assumptions.

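Tangency-path criterion. Suppose $u \in U \subseteq \mathbb{R}^n$, and let $(g, h)\colon U \to \mathbb{R}^m \times \mathbb{R}^k$.

A) (Mixed: Equalities and inequalities) We say that $(g, h)$ satisfies the
Mixed-Problem Tangency-Path Criterion at $u$ if:

i) all the $g^i$ and $h^i$ have partial derivatives at $u$,

ii) for any $v \in \mathbb{R}^n$,

$$g'_I(u)v \geqq 0 \ \&\ h'(u)v = 0 \implies v \in \mathrm{cl}(\mathrm{ch}(T_u C(g, h))). \tag{69a}$$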



Cf. the discussion of (5), p. 101. 
I is defined in (7). 
We can also write (ii) as:

$$L(g, h) \subseteq V(g, h), \tag{69b}$$

where we define:

$$L(g, h) = \{v \in \mathbb{R}^n : g'_I(u)v \geqq 0 \ \&\ h'(u)v = 0\}, \tag{70}$$

and:

$$V(g, h) = \mathrm{cl}(\mathrm{ch}(T_u C(g, h))). \tag{71}$$

B) (Inequalities) We say that $g$ satisfies the Inequalities-Problem Tangency-Path
Criterion at $u$ if:

i) all the $g^i$ have partial derivatives at $u$,

ii) for any $v \in \mathbb{R}^n$,

$$g'_I(u)v \geqq 0 \implies v \in \mathrm{cl}(\mathrm{ch}(T_u C(g))), \tag{72a}$$

i.e.

$$L(g) \subseteq V(g) \tag{72b}$$

in notation paralleling (69b).

C) (Equalities) We say that $h$ satisfies the Equalities-Problem Tangency-Path
Criterion at $u$ if:

i) all the $h^i$ have partial derivatives at $u$,

ii) for any $v \in \mathbb{R}^n$,

$$h'(u)v = 0 \implies v \in \mathrm{cl}(\mathrm{ch}(T_u C(h))), \tag{73a}$$

i.e.

$$L(h) \subseteq V(h) \tag{73b}$$

in notation paralleling (69b).

A path interpretation of the criterion. Paraphrased in terms of binding
constraints $g^i$, (72a) says that if a vector $v$ has a nonnegative inner product
$g^{i\prime}(u) \cdot v$ with each $g^{i\prime}(u)$, then $v$ is a limit of positive linear combinations
of tangents of “feasible” paths. In particular, if the $g^i$ were differentiable
then any direction $v$ in which all the $g^i$ had positive derivatives would be, if
not a direction (i.e., tangent) of a feasible path, then at least a positive linear
combination of directions of such paths.

Not merely possessing partial derivatives. 
Remark 6.1.

i) One might consider the simpler condition obtained from replacing
$\mathrm{cl}(\mathrm{ch}(T_u C(g)))$, for example, by its subset $\mathrm{ch}(T_u C(g))$:

$$L(g) \subseteq \mathrm{ch}(T_u C(g)). \tag{74}$$

However, that would impose a stronger requirement on constraints than
does (72b), as could be seen from this example:

g\x,y,z)=z^-y^-{x-\\z\\f 

2/ \ O^) 

9 {x,y,z) - 

ii) If we replace g^ by 



g\x,y,z) = 



g^{x, y, z)/^/{x‘^ +y^ + z^), 

0 , 



if (x,y,z) i- (0,0,0) 
Otherwise, 



then has the same partial derivatives at the origin (0,0,0) as in (75), 

but it is no longer Frechet differentiable at the origin, even though it defines the 
same constraint set as This illustrates the fact that Theorem 3 guar- 

antees Lagrange regularity even when full differentiability does not apply. The 
example still exhibits a discrepancy between $\mathrm{cl}(\mathrm{ch}(T_0 C(g)))$ and $\mathrm{ch}(T_0 C(g))$.

iii) Without the system (75) would not be Lagrange regular, even though 
the constraint set would be the same. This again emphasizes that Lagrange 
regularity is a property of constraint functions, not just of constraint sets.^^ 

We will show that the Tangency-Path Criterion is necessary and sufficient
for Lagrange regularity. By the definition of the Tangency-Path Criterion, the
functions $g^i$ and $h^i$ in the following theorem have partial derivatives; but it is
not assumed that they are Fréchet, or even Gâteaux, differentiable.



Cf. [13]. 

Cf. our comments on this dependence above, page 103, and our comments on
Removable Constraints above, p. 109, for Jacobian conditions. Also see [2], footnote 2,
p. 2, where again the form of constraints influences whether the rank constraint
qualification holds or not, and indeed whether the inequality constraint is Lagrange
regular or not.

Gould and Tolle [15], p. 167, presupposing differentiability and continuity
conditions that are not made in Theorems 3, 4, 5, or 6 below, state a criterion that is
the dual of a condition stronger than the Tangency-Path Criterion. (See Section 8
below.)

Our Theorem 3 is a strengthening of Arrow, Hurwicz, and Uzawa’s Theorem 1 
[4], as well as of the “if” part of Gould and Tolle’s theorem. Our Theorem 4 is a 
strengthening of Theorem 2 in Arrow, Hurwicz, and Uzawa [4]; it plays the same 
role, under weaker hypotheses, as the “only if” part of Gould and Tolle’s theorem. 
Theorem 3. The Tangency-path criterion is sufficient for Lagrange regularity.
Let $U \subseteq \mathbb{R}^n$, and let $u \in U$ be a limit point.

A) (Mixed: Equalities and inequalities) The Mixed-Problem Tangency-Path
Criterion is sufficient for Lagrange mixed-regularity for $(u, \mathcal{F}_d(u))$.

In other words, if $f\colon U \to \mathbb{R}$ is differentiable at $u$, if $g\colon U \to \mathbb{R}^m$,
if $h\colon U \to \mathbb{R}^k$, if $u$ maximizes $f$ subject to $g \geqq 0$ and $h = 0$, and
if $(g, h)$ satisfies the Mixed-Problem Tangency-Path Criterion at $u$, then
there exist a $\lambda \in \mathbb{R}^m$ and a $\mu \in \mathbb{R}^k$ such that:

$$f'(u)^T + \lambda^T g'(u) + \mu^T h'(u) = 0, \tag{77}$$

$$\lambda \geqq 0 \tag{78a}$$

$$\lambda_i = 0 \text{ if } i \notin I. \tag{78b}$$

B) (Inequalities) The Inequalities-Problem Tangency-Path Criterion is
sufficient for Lagrange inequality-regularity for $(u, \mathcal{F}_d(u))$.

In other words, if $g\colon U \to \mathbb{R}^m$, if $u$ maximizes $f\colon U \to \mathbb{R}$ subject
to $g \geqq 0$, if $f$ is differentiable at $u$, and if $g$ satisfies the Inequalities-Problem
Tangency-Path Criterion at $u$, then there exists a $\lambda \in \mathbb{R}^m$ such
that:

$$f'(u)^T + \lambda^T g'(u) = 0, \tag{79}$$

$$\lambda \geqq 0 \tag{80a}$$

$$\lambda_j = 0 \text{ if } j \notin I. \tag{80b}$$

C) (Equalities) The Equalities-Problem Tangency-Path Criterion is
sufficient for Lagrange equality-regularity for $(u, \mathcal{F}_d(u))$.

In other words, if $h\colon U \to \mathbb{R}^k$, if $u$ maximizes $f\colon U \to \mathbb{R}$ subject
to $h = 0$, if $f$ is differentiable at $u$, and if $h$ satisfies the Equalities-Problem
Tangency-Path Criterion at $u$, then there exists a $\mu \in \mathbb{R}^k$ such
that:

$$f'(u)^T + \mu^T h'(u) = 0. \tag{81}$$
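
As an illustration of what the conclusion of part (B) asks for, the following minimal Python sketch checks conditions (79), (80a) and (80b) for a given candidate multiplier vector. The toy problem, the function names `lagrange_residual` and `satisfies_lagrange`, and the tolerance are assumptions made here for illustration; they do not come from the text, and the sketch only verifies multipliers, it does not find them.

```python
# Hypothetical helper names; the toy data below are illustrative assumptions.

def lagrange_residual(grad_f, jac_g, lam):
    """Return the row vector f'(u)^T + lam^T g'(u) as a list of length n."""
    n = len(grad_f)
    return [grad_f[j] + sum(lam[i] * jac_g[i][j] for i in range(len(lam)))
            for j in range(n)]

def satisfies_lagrange(grad_f, jac_g, lam, binding, tol=1e-12):
    """Check (79), (80a) and (80b) for the candidate multipliers lam.

    binding is the set of (0-based) indices of constraints binding at u.
    """
    stationarity = all(abs(r) <= tol
                       for r in lagrange_residual(grad_f, jac_g, lam))
    nonnegative = all(l >= 0 for l in lam)
    vanishes_off_binding = all(lam[i] == 0
                               for i in range(len(lam)) if i not in binding)
    return stationarity and nonnegative and vanishes_off_binding

# Toy problem: maximize f(u) = -u1 - 2*u2 subject to
# g(u) = (u1, u2, 1 - u1) >= 0. The maximizer is u = (0, 0);
# the first two constraints are binding there, the third is not.
grad_f = [-1.0, -2.0]
jac_g = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
binding = {0, 1}

print(satisfies_lagrange(grad_f, jac_g, [1.0, 2.0, 0.0], binding))
print(satisfies_lagrange(grad_f, jac_g, [1.0, 2.0, 1.0], binding))
```

In this toy problem the constraints are linear, so the feasible directions at the origin fill the whole cone $L(g)$ and the multipliers $(1, 2, 0)$ exist, as the theorem predicts.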

Remark 6.2. Theorem 3’s sufficiency result does not require that the domain
$U$ be open, in contrast to Theorem 1’s sufficiency result. (Also in contrast to
Theorem 1’s Jacobian condition (25), our Tangency-Path Criterion fails for
the function $h$ in the non-Lagrange-regular example (35), when applied to the
nonnegative quadrant of $\mathbb{R}^2$.)



This does not contradict Theorem 1.C, which required the set $U$ to be open.
Conversion of equalities to inequalities. Before proving Theorem 3, we
note that, when working with constraint qualifications in the form of path
conditions, it is sometimes convenient to convert each equality $h^i = 0$ into two
inequalities $g^{i1} \geqq 0$ and $g^{i2} \geqq 0$, where

$$g^{i1} = h^i, \qquad g^{i2} = -h^i. \tag{82}$$

Thus from the original mixed constraints $(g, h)$, we obtain a corresponding set
$g(h)$ of just inequality constraints. Clearly the original constraint set remains
the same: $C(g(h)) = C(h)$. So the problem of maximizing a function $f$ on $U$
subject to $g(h) \geqq 0$ is equivalent to maximizing $f$ on $U$ subject to $h = 0$.
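
The conversion in (82) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: constraints are represented as Python callables on tuples, and the names `convert_equalities`, `feasible_mixed`, and `feasible_inequalities` are hypothetical helpers introduced here, not notation from the text.

```python
def convert_equalities(g, h):
    """Build g(h): keep every inequality in g, and replace each equality
    h_i = 0 by the pair of inequalities h_i >= 0 and -h_i >= 0, as in (82)."""
    converted = list(g)
    for h_i in h:
        converted.append(h_i)
        # Bind h_i as a default argument so each lambda keeps its own function.
        converted.append(lambda u, h_i=h_i: -h_i(u))
    return converted

def feasible_mixed(g, h, u):
    """Membership in the constraint set of the mixed system (g >= 0, h = 0)."""
    return all(g_i(u) >= 0 for g_i in g) and all(h_i(u) == 0 for h_i in h)

def feasible_inequalities(gs, u):
    """Membership in the constraint set of a pure inequality system gs >= 0."""
    return all(g_i(u) >= 0 for g_i in gs)

# Illustrative constraints: u1 >= 0 and u1 - u2 = 0.
g = [lambda u: u[0]]
h = [lambda u: u[0] - u[1]]
g_of_h = convert_equalities(g, h)

for point in [(1, 1), (1, 2), (-1, -1), (0, 0)]:
    print(point, feasible_mixed(g, h, point), feasible_inequalities(g_of_h, point))
```

The two feasibility tests agree at every point, which is the statement that the constraint set is unchanged; the substantive question treated next is whether the constraint qualifications also survive the conversion.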

While the constrained optimization problems are equivalent for $g(h)$ and $h$,
we must ask whether our constraint qualifications remain the same in passing
from $h$ to $g(h)$. For our Jacobian conditions (25) and (24), the answer is no.
But for our path-type constraint qualifications (73) and (72), the answer is
yes, as we now show.

Lemma 6.1 (Conversion lemma). For any inequality constraint functions $g$
and equality constraint functions $h$ that have partial derivatives at $u \in U$:

a) i) The constraints $(g, h)$ satisfy the Mixed Tangency-Path Criterion (69)
if and only if the associated inequalities $g(h)$ satisfy the Inequalities
Tangency-Path Criterion (72).

ii) The constraints $h$ satisfy the Equality Tangency-Path Criterion (73)
if and only if the associated inequalities $g(h)$ satisfy the Inequalities
Tangency-Path Criterion (72).

b) For any function $f\colon U \to \mathbb{R}$ that has partial derivatives at $u$, there exist
$\lambda \geqq 0$, with $\lambda_i = 0$ if $g^i$ is not binding, and $\mu$ such that:

$$f'(u)^T + \lambda^T g'_I(u) + \mu^T h'(u) = 0 \tag{83}$$

if and only if there exist $\lambda \geqq 0$, with $\lambda_i = 0$ if $g(h)^i$ is not binding,
such that:

$$f'(u)^T + \lambda^T g(h)'_I(u) = 0. \tag{84}$$

Proof (Part (a.i)) First note that, because $(g, h)$ and $g(h)$ define the same
constraint set $C(g, h) = C(g(h))$, the right hand side $\mathrm{cl}(\mathrm{ch}(T_u C(g, h)))$ of (69a)
is the same as the right hand side $\mathrm{cl}(\mathrm{ch}(T_u C(g(h))))$ of (72a). So it suffices to
prove that the left hand sides of (69a) and (72a) are equivalent. And for that it
clearly suffices to show that, for each $i = 1, \ldots, k$,

$$h^{i\prime}(u)v = 0 \iff \begin{cases} g^{i1\prime}(u)v \geqq 0, & \text{if } g^{i1} \text{ is binding and } g^{i2} \text{ is not} \\ g^{i2\prime}(u)v \geqq 0, & \text{if } g^{i2} \text{ is binding and } g^{i1} \text{ is not} \\ g^{i1\prime}(u)v \geqq 0 \ \&\ g^{i2\prime}(u)v \geqq 0, & \text{if } g^{i1} \text{ and } g^{i2} \text{ are both binding.} \end{cases} \tag{85}$$



When $h$ is absent, then $g(h)$ is understood to be $g$, and when $g$ is absent, then $g(h)$
is understood to contain the inequalities specified in (82).

For example, if $g(h) = (g^1, g^2) = (g^1, -g^1) \geqq 0$ corresponds to $h = 0$, then
clearly there can be no vector $y$ such that $g(h)'(u)y > 0$.

Suppose first that

$$h^{i\prime}(u)v = 0. \tag{86}$$

Then it is immediate from the definition of the $g^{ij}$ in (82) that $g^{ij\prime}(u)v = 0$ for
both $j = 1$ and $j = 2$.

Before proving the converse, we first note that:

$$\text{if } g^{i1} \text{ or } g^{i2} \text{ is not binding at } u, \text{ then } 0 = h^{i\prime}(u) = g^{i1\prime}(u) = g^{i2\prime}(u). \tag{87}$$

For if $g^{i1}$ is not binding, then there is a neighborhood of $u$ in which $g^{i1}$ is
nonnegative. Since $g^{i1}(u) = h^i(u) = 0$, this means that $g^{i1}$ is minimized at $u$
in that neighborhood, and so $0 = g^{i1\prime}(u) = h^{i\prime}(u)$, and $g^{i2\prime}(u) = -h^{i\prime}(u) = 0$.
The same argument applies if $g^{i2}$ is not binding.

Now to prove the converse in (85), suppose that one of the two first
conditions on the right hand side is satisfied. Since at least one of the $g^{ij}$ is not
binding, it follows from (87) that $h^{i\prime}(u) = 0$, so $h^{i\prime}(u)v = 0$.

Finally, suppose the last case on the right hand side holds. By
definition (82), that implies $h^{i\prime}(u)v \geqq 0$ and $-h^{i\prime}(u)v \geqq 0$, from which the left
hand side of (85) follows immediately.

(Part (a.ii)) Apply an argument similar to that for (a.i). 

(Part (b)) First, suppose there exist $\lambda \geqq 0$, with $\lambda_i = 0$ if $g^i$ is not binding,
and $\mu$ as in (83). For each $i = 1, \ldots, k$ with $\mu_i = 0$, we choose $\lambda_{i1} = \lambda_{i2} = 0$.
For each $\mu_i \neq 0$, we choose any $\lambda_{i1} \geqq 0$ and $\lambda_{i2} \geqq 0$ such that $\lambda_{i1} - \lambda_{i2} = \mu_i$;
then:

$$\lambda_{i1}\, g(h)^{i1\prime}(u) + \lambda_{i2}\, g(h)^{i2\prime}(u) = \lambda_{i1} h^{i\prime}(u) - \lambda_{i2} h^{i\prime}(u) = \mu_i h^{i\prime}(u). \tag{88}$$

With those $\lambda_{ij}$, (84) is satisfied. And we can further choose $\lambda_{i1} = \lambda_{i2} = 0$
if either $g^{i1}$ or $g^{i2}$ is not binding, since in that case $g^{i1\prime}(u) = g^{i2\prime}(u) = 0$
by (87).

Conversely, suppose there exists a $\lambda \geqq 0$, with $\lambda_i = 0$ if $g(h)^i$ is not
binding, and satisfying (84). We must show that there exist $\lambda \geqq 0$, with $\lambda_i = 0$
if $g^i$ is not binding, and $\mu$ satisfying (83). We begin by assigning to each
$g^i = g(h)^i$ the same multiplier $\lambda_i$. Now $g^i$ is binding if and only if the corresponding $g(h)^i = g^i$
is binding, so our hypothesis for this converse in part (b) implies that $\lambda_i = 0$ if
$g^i$ is not binding.

Next, for elements of $g(h)$ that are of the form $g^{i1}$ or $g^{i2}$ associated with
some $h^i$, there are two cases to consider in defining a $\mu_i$ to satisfy (83).

If both $g^{i1} = h^i$ and $g^{i2} = -h^i$ are binding, then we define $\mu_i = \lambda_{i1} - \lambda_{i2}$.

If either of $g^{i1}$ or $g^{i2}$ is not binding, then by (87) we have $g^{i1\prime}(u) = g^{i2\prime}(u) =
h^{i\prime}(u) = 0$, and we define $\mu_i = 0$.

With those $\lambda_i$ and $\mu_i$, (83) is satisfied.

□ 

Because of the Conversion Lemma, we can restrict discussion of path con- 
ditions to inequality constraints. The results will apply as well to equality and 
mixed constraints. 

Proof of Theorem 3. 

Parts (A) and (C) follow immediately from part (B) and the Conversion
Lemma, by converting equality constraints $h = 0$ into inequality constraints
$g(h) \geqq 0$, as in (82).

In the mixed case (A), for example, if the Mixed Tangency-Path Criterion
holds for constraints $(g, h)$, then, by the “only if” in (a.i) of the Conversion
Lemma, the Inequalities Tangency-Path Criterion holds for the associated
inequality constraints $g(h)$. So by the sufficiency proof below for part (B), for any
$f$ with partial derivatives there exist $\lambda \geqq 0$, with $\lambda_i = 0$ if $g(h)^i$ is not binding,
satisfying (84); hence by the “if” in (b) of the Conversion Lemma there
exist $\lambda \geqq 0$, with $\lambda_i = 0$ if $g^i$ is not binding, and $\mu$ satisfying (83). A similar
argument shows that the Equalities Tangency-Path Criterion is sufficient for
Lagrange regularity with just equality constraints.

(B. Inequalities) For each $j$ with $j \notin I$, define:

$$\lambda_j = 0. \tag{89}$$

Then (80b) is satisfied. If the index set $I$ of binding constraints is empty, we
are done, since then $u$ is in the interior of the constraint set, so:

$$\frac{\partial f}{\partial u_j}(u) = 0 \quad (j = 1, \ldots, n), \tag{90}$$

which together with (89) satisfies (79). So we assume $I = \{1, \ldots, p\} \neq \emptyset$.

By the Reduction principle (16), it suffices to find a $\lambda \in \mathbb{R}^p$ satisfying
$\lambda^T g'_I(u) = -f'(u)^T$ and $\lambda \geqq 0$. Arguing by contradiction, suppose there
does not exist such a $\lambda$. Then the Fundamental Lemma implies there exists a
$z \in \mathbb{R}^n$ such that:

$$g'_I(u)z \geqq 0 \tag{91a}$$

$$f'(u) \cdot z > 0. \tag{91b}$$



$I$ is defined in (7).

By the Tangency-Path Criterion, property (91a) implies the existence of a
sequence of $w^i \in \mathrm{ch}(T_u C(g))$ with:

$$w^i \xrightarrow[i \to \infty]{} z. \tag{92}$$

For all large enough $i$, therefore, (91b) implies:

$$f'(u) \cdot w^i > 0. \tag{93}$$

By (138a), for each $i$ there exist nonnegative $t_{i,1}, \ldots, t_{i,q}$ summing to 1, and
there exist $w^{i,1}, \ldots, w^{i,q} \in T_u C(g)$ with:

$$w^i = t_{i,1} w^{i,1} + \cdots + t_{i,q} w^{i,q}. \tag{94}$$

It follows from (93) and (94) that:

$$f'(u) \cdot w^{i,j} > 0 \tag{95}$$

for some large $i$ and some $j = 1, \ldots, q$. Since $w^{i,j} \in T_u C(g)$ and $w^{i,j} \neq 0$ (by
(95)), it follows by (141) and the Proposition on Paths and Tangent Cones (146)
that there is a path $\phi\colon T \subseteq [0, 1) \to C(g)$ with $\phi(0) = u$ and $\phi'(0) = w^{i,j}$.
Then:

$$\left.\frac{d}{dt}\right|_{t=0} f(\phi(t)) = f'(u) \cdot \phi'(0) = f'(u) \cdot w^{i,j} > 0. \tag{96}$$

So, for some small $t > 0$,

$$f(\phi(t)) > f(\phi(0)) = f(u), \tag{97}$$

contradicting that $u$ maximizes $f$. □

Theorem 3 shows that the Tangency-Path Criterion is strong enough for
Lagrange regularity. Now we show it is not too strong.



Page 124. 

Actually, Carathéodory’s Theorem ensures that we can choose $q = n + 1$ (cf. [37],
p. 155 (Theorem 17.1)). In fact, since $T_u C(g)$ is a cone, we can choose $q = n$
([37], p. 156 (Theorem 17.1.2)).

See definition page 124. 
Theorem 4. The Tangency-path criterion is necessary for Lagrange regularity.
Let $U \subseteq \mathbb{R}^n$, and let $u \in U$ be a limit point. Suppose $g\colon U \to \mathbb{R}^m$
and $h\colon U \to \mathbb{R}^k$ have partial derivatives at $u$.

A) (Mixed: Equalities and inequalities) If $(g, h)$ is Lagrange mixed-regular
for $(u, \mathcal{F}_d(u))$, then $(g, h)$ satisfies the Mixed-Problem Tangency-Path
Criterion.

B) (Inequalities) If $g$ is Lagrange inequality-regular for $(u, \mathcal{F}_d(u))$, then
$g$ satisfies the Inequality-Problem Tangency-Path Criterion.

C) (Equalities) If $h$ is Lagrange equality-regular for $(u, \mathcal{F}_d(u))$, then $h$
satisfies the Equality-Problem Tangency-Path Criterion.

Remark 6.3. Under the other hypotheses of Theorems 1 and 1+, the Jacobian
Criterion is strictly stronger than the Tangency-Path Criterion. There are two
ways to see that the Jacobian hypotheses imply the Path Criterion. First, under
the other hypotheses together with the Jacobian conditions, Theorems 1 and 1+
imply Lagrange regularity, so Theorem 4 implies that the Path Criterion holds.
More directly, from examining the proofs of Theorems 1 and 1+ it is clear that
the use we made of the Jacobian Criterion was precisely to obtain a path whose
derivative satisfies the Path Criterion.

On the other hand, even with those other hypotheses, the Path Criterion
does not imply the Jacobian Criterion. That is clear from Example 6.1,
page 124, where the Jacobian Criterion clearly fails at $(0, 0)$, but we can easily
verify directly that the Path Criterion holds at $(0, 0)$. Alternatively, we can
check that Lagrange regularity holds there, so again by Theorem 4 the Path
Criterion holds there.

Proof. 

Parts (A) and (C) again follow from part (B) and the Conversion Lemma,
by converting equality constraints $h = 0$ into inequality constraints $g(h) \geqq 0$,
as in (82).

In the mixed case (A), for example, if $(g, h)$ is Lagrange-regular, we convert
the equality constraint functions $h$ to obtain inequalities $g(h)$, including
the original inequality constraints $g$. We then use the proof below for part (B)
to establish that the Inequalities Tangency-Path Criterion (72) holds for $g(h)$.
By the “if” part in (a.i) of the Conversion Lemma, it follows that the Mixed
Tangency-Path Criterion (69) holds for $(g, h)$. A similar argument shows that
the Equalities Tangency-Path Criterion (73) is necessary for Lagrange regularity
with just equality constraints.

The vectors $v = (v_1, v_2)$ satisfying the hypothesis of the Path Criterion (72) are
$v = (v_1, 0)$ for arbitrary $v_1$. On the other hand, the constraint set $C(g^1, g^2)$ is the
part of the plane above the graph of the function $\psi$ with values $u_1^2 \sin(1/u_1)$
for nonzero $u_1$, and with value 0 for $u_1 = 0$; so the paths from $(0, 0)$ into the graph
of the function $\psi$ have derivative $(w_1, 0)$ for some $w_1$, and for every $w_1$ there is
a path into the graph with that derivative. Thus any $v$ satisfying the left hand side
of (72) belongs to the right hand side.

It is clear from the preceding footnote that any function $f$ maximized at $(0, 0)$ has
$\partial f/\partial u_1 = 0$ there. Since we also have each $\partial g^i/\partial u_1 = 0$ as well as $\partial g^1/\partial u_2 = 1$
and $\partial g^2/\partial u_2 = -1$ there, it is easy to choose nonnegative $\lambda_1$ and $\lambda_2$ to obtain the
Lagrange condition (11) for each such maximand $f$.

(B. Inequalities) Suppose $g$ is Lagrange inequality-regular for $(u, \mathcal{F}_d(u))$,
so the $g^i$ have partial derivatives at $u$. To verify (72), we must show: for any
$z \in \mathbb{R}^n$,

$$g'_I(u)z \geqq 0 \implies z \in V(g). \tag{98}$$

Without loss of generality, we will suppose that $u$ is the origin.

a) If there are no binding constraints $g^i$, then $L(g) = \emptyset$, hence (98) is
vacuously true. So now we suppose that there are binding constraints, i.e.,
the set $I = \{1, \ldots, p\}$ is nonempty.

b) For a proof by contradiction, suppose (98) is not true, so there is a $z \in
L(g) \setminus V(g)$. Now $V(g) = \mathrm{cl}(\mathrm{ch}(T_0 C(g)))$ is a closed convex cone, not equal
to $\mathbb{R}^n$ (it does not contain $z$); so there exists a $q \in \mathbb{R}^n$ such that:

$$q \neq 0 \tag{99a}$$

$$q \cdot u < 0 \text{ for all } u \in V(g) \text{ with } -u \notin V(g) \tag{99b}$$

$$q \cdot u = 0 \text{ for all } u \in V(g) \text{ with } -u \in V(g) \tag{99c}$$

$$q \cdot z > 0 \tag{99d}$$

$$q \cdot q = 1. \tag{99e}$$

We define the hyperplane $H_q$ by:

$$H_q = \{u \in \mathbb{R}^n : q \cdot u = 0\}, \tag{100}$$

so $q$ is orthogonal to $H_q$. And we define the subspace orthogonal to $H_q$:

$$Q = \{tq : t \in \mathbb{R}\}, \tag{101}$$

so $\mathbb{R}^n$ is the direct sum:

$$\mathbb{R}^n = H_q \oplus Q. \tag{102}$$
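
The direct-sum decomposition (102) is easy to compute explicitly when $q \cdot q = 1$, as required by (99e): the $Q$-coordinate of $u$ is $t = q \cdot u$, and the $H_q$-component is $x = u - tq$. The short Python sketch below illustrates this; the function names `dot` and `decompose` and the sample vectors are assumptions chosen here for illustration.

```python
def dot(a, b):
    """Euclidean inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))

def decompose(u, q):
    """Split u as x + t*q with x in H_q (so q.x = 0) and t real.

    Assumes q is normalized so that q.q = 1, as in (99e).
    """
    t = dot(q, u)
    x = [ui - t * qi for ui, qi in zip(u, q)]
    return x, t

q = [0.6, 0.8]            # q.q = 0.36 + 0.64 = 1
u = [2.0, 1.0]
x, t = decompose(u, q)
print("t =", t)
print("x =", x)
print("q.x =", dot(q, x))  # zero up to rounding, so x lies in H_q
```

The coordinate $t$ is the "height" of $u$ above the hyperplane $H_q$, which is the quantity the proof below uses to build the maximand.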

c) In what follows we shall define a differentiable function $f\colon \mathbb{R}^n \to \mathbb{R}$
that is maximized on $C(g)$ at $u = 0$, and for which $f'(0) = q$. Then by part (a)
above and Lagrange inequality-regularity, there must exist $\lambda_i \geqq 0$ such that
$f'(0) = -\sum_{i=1}^{p} \lambda_i g^{i\prime}(0)$, so:

$$q \cdot z = f'(0) \cdot z \tag{103a}$$

$$= -\sum_{i=1}^{p} \lambda_i g^{i\prime}(0) \cdot z \tag{103b}$$

$$\leqq 0 \quad \text{(by (70), since } z \in L(g)\text{)}, \tag{103c}$$

which contradicts (99d).

See definition page 103.

$I$ is defined in (7).

See the definition of binding constraint, page 101, and the notation established in
(7) and (8).

Cf. [26], p. 315, Theorem 2.7.

d) Finding a function / as in (c) is equivalent to finding a function 

/: ^ IR that is differentiable at 0, has fuif^) = Q, and is maximized 

on C{g) at 0 . 

e) Our intuition in defining $f$ is simple. If all of $C(g)$ lies “below” the
hyperplane $H_q$, then $f(x + tq) = t$ (for all $x \in H_q$ and $tq \in Q$) clearly satisfies
the requirements of part (c) above. Now properties (99b, c) almost imply that
$V(g)$, hence its subset $C(g)$, does lie below the hyperplane $H_q$. However they
in fact allow $C(g)$ itself to rise “above” $H_q$, though only “gradually,” much
as the function $y = x^2$ rises above the $x$-axis. So we will define the graph of
a function $\gamma$ below as the “upper boundary” of $C(g)$, and show it has (like the
function $x^2$) a zero derivative at 0; then we define the level sets of the desired
function $f$ through shifts of the graph of $\gamma$.

f) We define: 

B^ = {ueu-. Nl^^} 

= C{g) n s'' (104) 




g) To define the function / we will first define, for each k e IN, a function 
7^: iJg IRU {— cxd}; then we will change 7^ into 7^ so that its values 
belong to the ball finally, we will define / in terms of 7 ^. 

We define for each $k \in \mathbb{N}$, the function $\gamma^k\colon H_q \to \mathbb{R} \cup \{-\infty\}$ by:

$$\gamma^k(x) = \begin{cases} \sup\{t : x + tq \in C^k\}, & \text{if } x + tq \in C^k \text{ for some } t \\ -\infty, & \text{otherwise.} \end{cases} \tag{105}$$

Because $C^k \subseteq B^k$, the values of $\gamma^k$ are either finite or $-\infty$. We note that:

$$\gamma^k(x) \geqq t \text{ for all } x + tq \in C^k. \tag{106}$$

To obtain a function bounded below, we define:

$$\bar\gamma^k(x) = \max\{0, \gamma^k(x)\} \text{ for all } x \in H_q, \tag{107}$$

so:

$$\bar\gamma^k \geqq \gamma^k \text{ and } \bar\gamma^k \geqq 0. \tag{108}$$

h) Next we show that for all large enough $k \in \mathbb{N}$:

$$\bar\gamma^k(0) = 0. \tag{109}$$

First we note that for large enough $k \in \mathbb{N}$:

$$0 < t \implies tq \notin \mathrm{cl}(C^k). \tag{110}$$

We use the supremum rather than the maximum because the sets $C^k$ are not
necessarily closed. That is because the constraint functions $g^i$ are only assumed to have
partial derivatives, and only at the origin, so they are not necessarily continuous in
any neighborhood of 0.



Otherwise there would exist a sequence of points + t*^q) € with > 0 
and: 



< _ 

— 2^ fc— »oo 



0 



( 111 ) 



andx^+f^g 



k~^oo 



0 (since G C^). So by (140a) and compactness 



of the unit ball, there is a subsequence (re-using index k for convenience) with: 

some w G ToC{g). (1 12) 






Since w G ToC{g) £ V{g), properties (99b, c) imply: 

q-w^O. (113) 

But using (1 1 1) we see that, in the maximum norm we have, for all large k: 

Q 



xk ^kq ^k ^kq 

\\x>^+t^q\\ \\t^q\\ 



k-^oo 



(114) 



so $w = q/\|q\|$. Then (113) implies $q \cdot q \leqq 0$, so $q = 0$, contradicting (99a) and
completing the proof of (110). So for all large enough $k \in \mathbb{N}$ we have:

$$tq \in \mathrm{cl}(C^k) \implies t \leqq 0, \tag{115}$$

and therefore (109) holds.

i) Now we show that, for all sufficiently large $k \in \mathbb{N}$:

$$\bar\gamma^k \text{ is differentiable at } 0 \in H_q \text{ and } \bar\gamma^{k\prime}(0) = 0. \tag{116}$$

Let $x^i \in H_q$ with $0 \neq \|x^i\| \to 0$. We must show that:

$$\frac{|\bar\gamma^k(x^i) - \bar\gamma^k(0)|}{\|x^i - 0\|} \xrightarrow[i \to \infty]{} 0 \in \mathbb{R}, \tag{117}$$

which by (108) and (109) means:

$$\frac{\bar\gamma^k(x^i)}{\|x^i\|} \xrightarrow[i \to \infty]{} 0. \tag{118}$$

We only need consider subsequences with $\bar\gamma^k(x^i) > 0$, hence $\bar\gamma^k(x^i) =
\gamma^k(x^i)$, and show that any such “positive” subsequence itself has a subsequence
satisfying (118). Now on any subsequence of a positive subsequence (indexed
by “$j$”):

$$\gamma^k(x^j) \xrightarrow[j \to \infty]{} 0, \tag{119}$$

since otherwise this subsequence itself would have a subsequence converging
to some $t > 0$; but then (by (105)) there would exist $x^j + t^j q \in C^k$ converging
to $tq \in \mathrm{cl}(C^k)$, contradicting (110).

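Now (119) implies:

$$0 \neq \|x^j + \gamma^k(x^j)q\| \xrightarrow[j \to \infty]{} 0, \tag{120}$$

so by compactness of the unit ball, there is a subsequence of the $j$’s on which:

$$\frac{x^j + \gamma^k(x^j)q}{\|x^j + \gamma^k(x^j)q\|} \xrightarrow[j \to \infty]{} \text{some } w. \tag{121}$$

We will show that (118) holds on any such subsequence (121). First note that:

$$w \in T_0 C(g). \tag{122}$$

By (120) and (121), that would be immediate from (140) if $x^j + \gamma^k(x^j)q \in C^k$;
but (105) only requires $\gamma^k(x^j)$ to be the sup of $t$ for which $x^j + tq \in C^k$.
However, (105) and (121) imply there are $t^j$ with $0 \neq x^j + t^j q \in C^k$,
$\|x^j + t^j q\| \to 0$, and:

$$\frac{x^j + t^j q}{\|x^j + t^j q\|} \xrightarrow[j \to \infty]{} w, \tag{123}$$

so (122) holds.

From (122) and (99b, c) we see that:

$$0 \geqq q \cdot w. \tag{124}$$

Then (121) and (124) imply:

$$0 \geqq q \cdot w = \lim_{j \to \infty} q \cdot \frac{x^j + \gamma^k(x^j)q}{\|x^j + \gamma^k(x^j)q\|} = \lim_{j \to \infty} \frac{\gamma^k(x^j)\,(q \cdot q)}{\|x^j + \gamma^k(x^j)q\|} \quad (\text{since } x^j \in H_q)$$

$$= \lim_{j \to \infty} \frac{\gamma^k(x^j)}{\|x^j\| + \gamma^k(x^j)} \quad (\text{using (99e) and the sum norm})$$

$$\geqq 0 \quad (\text{since } \gamma^k(x^j) > 0), \tag{125}$$

which implies (118).

See footnote 63, page 134.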

j) Now we define $f\colon \mathbb{R}^n \to \mathbb{R}$ by:



(125) 



$$f(u) = t - \bar\gamma^K(x) \quad \text{for all } u \in \mathbb{R}^n \text{ with } u = x + tq \in H_q \oplus Q, \tag{126}$$

where $K$ is any $k$ so large that (109) and (116) hold, as in parts (h) and (i).
Clearly $0 \in \mathbb{R}^n$ maximizes $f$ on $C^K$, since $f(0) = 0$ by (126) and (109), and
if $x + tq \in C^K \subseteq C(g)$ then $f(x + tq) \leqq 0$ by (126), (106), and (108).

By parts (c) and (d) above, it only remains to show that $f$ is differentiable
at the origin $0 \in \mathbb{R}^n$, with $f_u(0) = q$. And for that it suffices to show:

$$f(x + tq) - f(0) = f_u(0) \cdot (x + tq) + o(\|x + tq\|) = q \cdot (x + tq) + o(\|x + tq\|). \tag{127}$$

Since $f(0) = 0$ and $q \cdot x = 0$ for $x \in H_q$, this becomes:

$$f(x + tq) = t + o(\|x + tq\|), \tag{128}$$

i.e.,

$$\frac{\bar\gamma^K(x)}{\|x + tq\|} \xrightarrow[\|x + tq\| \to 0]{} 0. \tag{129}$$

Using the sum norm, this becomes:

$$\frac{\bar\gamma^K(x)}{\|x\| + \|tq\|} \xrightarrow[\|x + tq\| \to 0]{} 0, \tag{130}$$

which follows immediately from (116) and the fact that $\|x + tq\| \to 0$ implies
both $\|x\| \to 0$ and $\|tq\| \to 0$, since $x \in H_q$, $tq \in Q$, and $H_q \perp Q$. □

If we wanted $f$ to have a unique maximum at the origin, we could instead define it
by:

$$f(u) = t - \bar\gamma^K(x) - \|x\|^2$$

where $\|\cdot\|$ is the Euclidean norm.

See (135c), page 139. 
7. Appendix

Notation and terminology.

The set of natural numbers is $\mathbb{N} = \{0, 1, 2, 3, \ldots\}$, and the set of real
numbers is $\mathbb{R}$. For $u \in \mathbb{R}^n = \mathbb{R}^r \times \mathbb{R}^s$ we write $u = (x, y)$, where $x \in \mathbb{R}^r$
and $y \in \mathbb{R}^s$. If $U \subseteq \mathbb{R}^n$, then we say that $u \in \mathbb{R}^n$ is a limit point (or
accumulation point) of $U$ if every neighborhood of $u$ contains a point of $U$
other than $u$.

Suppose $F\colon U \subseteq \mathbb{R}^n \to \mathbb{R}^p$ and $\bar u \in U$ is a limit point of $U$ (which is
not necessarily open). Then we say that $F$ has a Fréchet derivative at $\bar u$ if
there exists a linear transformation $L\colon \mathbb{R}^n \to \mathbb{R}^p$ such that

$$F(u) = F(\bar u) + L(u - \bar u) + o(\|u - \bar u\|) \text{ for all } u \in U. \tag{131}$$

We will denote such a linear transformation by $F_u(\bar u)$. Differentiability will
always mean Fréchet differentiability unless otherwise noted. When $U$ is not
an open set, the transformation $L$ may not be unique.

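Partial derivatives with respect to subspaces are defined for functions $F$ at
$u$ in the usual way. When they exist, they may be denoted by $\frac{\partial F}{\partial x}(u)$, $F_x(x, y)$,
or $F_y(x, y)$.

When partial derivatives $\partial F^i/\partial u_j(u)$ exist at $u$, then $F'(u)$ denotes the
$p \times n$ Jacobian matrix:

$$F'(u) = \begin{bmatrix} \dfrac{\partial F^1(u)}{\partial u_1} & \cdots & \dfrac{\partial F^1(u)}{\partial u_n} \\ \vdots & & \vdots \\ \dfrac{\partial F^p(u)}{\partial u_1} & \cdots & \dfrac{\partial F^p(u)}{\partial u_n} \end{bmatrix}. \tag{132}$$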



When a Fréchet derivative $F_u(u)$ exists, as a linear transformation it is
represented, with respect to the standard bases of $\mathbb{R}^n$ and $\mathbb{R}^p$, by the matrix $F'(u)$, and
so, for any $z \in \mathbb{R}^n$:

When $F$ is Fréchet differentiable at $u$, then $F'(u)z$ is a
matrix representation of the vector $F_u(u)z$. (133)

We will often split the space $\mathbb{R}^n$ into two factors, $\mathbb{R}^n = \mathbb{R}^r \times \mathbb{R}^s$, and
consider $X \subseteq \mathbb{R}^r$ and $Y \subseteq \mathbb{R}^s$ with $(x, y) \in X \times Y$. Then when $F\colon X \times Y \to
\mathbb{R}^q$ and $x$ is a limit point of $X$, if $F$ has partial derivatives at $(x, y)$ with respect
to the $x$ variables, then the corresponding $q \times r$ Jacobian matrix is denoted by
$F'_x(x, y)$; if $F(\,\cdot\,, y)$ has a Fréchet derivative at $x$, it is denoted by $F_x(x, y)$.
Analogously, when $y$ is a limit point of $Y$ and $F$ has partial derivatives at
$(x, y)$ with respect to the $y$ variables, then the corresponding $q \times s$ Jacobian
matrix is denoted by $F'_y(x, y)$; if $F(x, \,\cdot\,)$ has a Fréchet derivative at $y$, it (or
any representative) is denoted by $F_y(x, y)$.

If $A$ and $B$ are linear subspaces of $\mathbb{R}^n$, and if every element $u$ of $\mathbb{R}^n$ can
be written uniquely as a sum of elements in $A$ and $B$: $u = x + y$, where $x \in A$
and $y \in B$, then we say that $\mathbb{R}^n$ is the direct sum of $A$ and $B$, and we write:

$$\mathbb{R}^n = A \oplus B. \tag{134}$$



All norms in a finite dimensional linear vector space lead to the same
notions of convergence and differentiability; as convenient, we may use: for any
$x \in \mathbb{R}^\ell$,

the Euclidean norm: $\|x\| = \sqrt{x_1^2 + \cdots + x_\ell^2}$ (135a)

the maximum norm: $\|x\| = \max\{|x_1|, \ldots, |x_\ell|\}$ (135b)

the sum norm: for any normed subspaces $A$ and $B$, (135c)
if $x = a + b \in A \oplus B$ & $a \in A$ & $b \in B$,
then $\|x\| = \|a\| + \|b\|$.

For any $v \in \mathbb{R}^n$ and any real $\gamma$, we denote the closed $\gamma$-ball in $\mathbb{R}^n$ about $v$
by:

$$B^n_\gamma(v) = \{v + w \in \mathbb{R}^n : \|w\| \leqq \gamma\}, \tag{136}$$

or simply $B_\gamma(v)$.

For $x$ and $y$ in $\mathbb{R}^n$:

$x \geqq y$ means $x_i \geqq y_i$ for all $i = 1, \ldots, n$
$x \geq y$ means $x \geqq y$ and $x \neq y$ (137)
$x > y$ means $x_i > y_i$ for all $i = 1, \ldots, n$.

For an open set $U \subseteq \mathbb{R}^n$ and for $f = (f^1, \ldots, f^q)\colon U \to \mathbb{R}^q$, the notation
$f \geqq 0$ means $f^i(x_1, \ldots, x_n) \geqq 0$ for each $i = 1, \ldots, q$ and for each $x \in U$.

For any subset $S$ of $\mathbb{R}^n$:

$\mathrm{ch}(S)$ = the convex hull of $S$ (138a)
  $= \{t_0 x^0 + t_1 x^1 + \cdots + t_m x^m : x^0, \ldots, x^m \in S$ & $t_0, \ldots, t_m \geqq 0$ & $t_0 + t_1 + \cdots + t_m = 1\}$

$\mathrm{cl}(S)$ = the topological closure of $S$ (138b)

$\mathrm{cone}(S)$ = the conical closure of $S$ (138c)
  $= \{tx : x \in S$ & real $t \geqq 0\}$

$\mathrm{wedge}(S) = \mathrm{cone}(\mathrm{ch}(S))$ (138d)
  = the convex cone generated by $S$

$\mathrm{span}(S)$ = the linear subspace generated by $S$. (138e)

For any subsets $S$ and $T$ of $\mathbb{R}^n$:

$S + T = \{x + y : x \in S$ & $y \in T\}$
$S + \emptyset = S = \emptyset + S$. (139)



Elements $x \in \mathbb{R}^n$ will be treated as column vectors for matrix multiplication;
then we denote the transpose by $x^T$, which we treat as a row vector.

The tangent cone. To describe derivatives of paths, we use the tangent
cone. For any set $S \subseteq \mathbb{R}^n$ with $u \in S$, a vector $v \in \mathbb{R}^n$ is tangent to $S$ at $u$
if:

either:

a) there exists a sequence of points $u^i \in S$ such that: (140a)

i) $u^i \neq u$ for all $i = 1, 2, 3, \ldots$

ii) $\|u^i - u\| \xrightarrow[i \to \infty]{} 0$

iii) $\dfrac{u^i - u}{\|u^i - u\|} \xrightarrow[i \to \infty]{} v$

or:

b) $u$ is an isolated point of $S$ and $v = 0$. (140b)

We define:

$$T_u S = \{rv \in \mathbb{R}^n : v \text{ is tangent to } S \text{ at } u \ \&\ r \text{ is a nonnegative real number}\}. \tag{141}$$

As this is clearly a cone, $T_u S$ is called the tangent cone of $S$ at $u$. (For an
interpretation by paths, see the Proposition on Paths and Tangent Cones below.)
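
To make the sequence condition in the definition of a tangent vector concrete, the Python sketch below computes normalized differences $(u^i - u)/\|u^i - u\|$ along a sequence in a set $S$. It assumes that (140a) asks for these normalized differences to converge to $v$, and it uses the parabola $S = \{(x, y) : y = x^2\}$ with base point $u = (0, 0)$ as an illustrative example that does not appear in the text; the function names are hypothetical.

```python
import math

def unit_direction(point, base):
    """Normalized difference (point - base) / ||point - base||, Euclidean norm."""
    diff = [p - b for p, b in zip(point, base)]
    norm = math.sqrt(sum(c * c for c in diff))
    return [c / norm for c in diff]

def parabola_point(i):
    """The point u^i = (1/i, 1/i^2) of S = {(x, y): y = x^2}; it differs from
    the origin and converges to it as i grows."""
    return [1.0 / i, 1.0 / i ** 2]

base = [0.0, 0.0]
for i in (1, 10, 100, 1000):
    print(i, unit_direction(parabola_point(i), base))
```

The printed directions approach $(1, 0)$, so $(1, 0)$ is tangent to the parabola at the origin, and every nonnegative multiple of it lies in the tangent cone (141).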

Algebra. Our understanding of constraint qualifications, and our proofs of 
Lagrange regularity rest on an algebraic theorem of the alternative. 

Theorem 5. Theorem of the alternative. Let $A$, $B$, and $C$ be matrices whose
components are from $\mathbb{R}$. Suppose that

$A$ has $a$ rows and $n$ columns, and $\alpha \in \mathbb{R}^a$,
$B$ has $b$ rows and $n$ columns, and $\beta \in \mathbb{R}^b$,
$C$ has $c$ rows and $n$ columns, and $\gamma \in \mathbb{R}^c$,

where $0 < a \in \mathbb{N}$, $0 < b \in \mathbb{N}$, $0 < c \in \mathbb{N}$, and $0 < n \in \mathbb{N}$.

Then:

I) Exactly one of (1) or (2) is true:

1) There exists $r = (r_1, \ldots, r_n) \in \mathbb{R}^n$ solving:

$Ar > \alpha$
$Br \geqq \beta$
$Cr = \gamma$.

2) There exist $u \in \mathbb{R}^a$, $v \in \mathbb{R}^b$, and $z \in \mathbb{R}^c$, such that both (a)
and (b) hold:

a) $u^T A + v^T B + z^T C = 0$

b) either (i) or (ii) holds:

i) $u \geq 0$ & $v \geqq 0$ & $u^T\alpha + v^T\beta + z^T\gamma \geqq 0$

or

ii) $u = 0$ & $v \geqq 0$ & $v^T\beta + z^T\gamma > 0$.

II) When some, but not all, of $(A, \alpha)$ or $(B, \beta)$ or $(C, \gamma)$ are not present,
then the same alternatives (1) and (2) in part (I) above hold with these
modifications:

In (I,1):

remove the first row if $A$ and $\alpha$ are not present;
remove the second row if $B$ and $\beta$ are not present;
remove the third row if $C$ and $\gamma$ are not present.

In (2,a):

set $A = 0$ and $u = 0$ if $A$ and $\alpha$ are not present;
set $B = 0$ and $v = 0$ if $B$ and $\beta$ are not present;
set $C = 0$ and $z = 0$ if $C$ and $\gamma$ are not present.

In (2,b):

only case (ii) occurs if $A$ and $\alpha$ are not present;
set $\beta = 0$ and $v = 0$ if $B$ and $\beta$ are not present;
set $\gamma = 0$ and $z = 0$ if $C$ and $\gamma$ are not present.

The theorem can be proved from the Transposition Theorem of Motzkin
[35], [33], pp. 28(2)-29, which is a homogeneous version (equivalent to setting
$\alpha = 0$, $\beta = 0$, and $\gamma = 0$). A proof by Fourier elimination (cf. [39], pp. 1-20,
[28]) shows that it applies to arbitrary ordered fields. Cf. also [34], p. 181.

Analysis. We use an implicit function theorem with weaker than standard 
differentiability hypotheses. This is a special case of Theorem 1 of [21]. 
Theorem 6. A Non-C^1 implicit function theorem. Let X be a subset of IR^,
let Y be a subset of IR^, and letU ^ X xY. Let u = {x^y) E U, where x E X 
is a limit point of X, and y EY. Suppose that X xY IR^ and: 



-ipix,y) =0; 

-0( • , is differentiable at (x, y), 

with partial derivatives v) cmd 'ipy{x^ y); 
'ipy{x^y) is surjective; i.e., 



(142a) 

(142b) 

(142c) 



det 



'dip^{x,y) 


di(j^{x,y) 


dyi 


dyk 


dip'^{x,y) 


dip'^{x,y) 


. dyi 


dyk 



Suppose that, for some ^ > \\{^|Jy{x^y)) ^)||; 

X e X and y e => 



^ 0 . 



67 



(142d) 



(x, y) eU and 'ip{x^ • ) is defined and continuous on B!f\\x\\ {v)- 



For any ^ > 0 and x E X, let X^ = (x) fl X 

Then: 



a) There exists a real ^ > 0 and a function <j) E 
that: for all x E X^, 



^{x,(j){x)) = 0 
0(x) = y. 



(143a) 

(143b) 



b) The S in part (a) can be chosen so that every function f E 
rixGX^ ^^\x\\ (y) satisfying (143) is differentiable at x. 

c) For every ^ > 0, all functions f E IIxgx^ ^^\\x\\^y) satisfying (143) 
that are differentiable at x have the same derivative fxi^): 

4>x{x) = - {fy{x, y)y^fx{x, y). (144) 

A path interpretation of the tangent cone. The tangent cone of a set can 
be interpreted as the set of derivatives of paths into the set. 

Let $\bar u \in S \subseteq \mathbb{R}^{n}$, and suppose that $\bar u$ is a limit point of $S$. Suppose $0 \in T \subseteq [0, 1]$ and 0 is a limit point of $T$ and $\phi : T \to S$ has $\phi(0) = \bar u$; then we call $(\phi, T)$ a path from $\bar u$ into $S$. If also $\phi$ has a derivative $v = \phi'(0) \in \mathbb{R}^{n}$ at $0 \in T$, then we say that $v$ is a tangent of the path $(\phi, T)$. We denote by $D_{\bar u}(S)$ the set of all tangents of paths from $\bar u$ into $S$:

$$D_{\bar u}(S) = \{ v \in \mathbb{R}^{n} : \exists (\phi, T)\ [(\phi, T) \text{ is a path from } \bar u \text{ into } S \ \&\ v = \phi'(0)] \}. \tag{145}$$

By the norm of a linear transformation $L$, we mean $\|L\| = \max\{\|Lx\| : \|x\| = 1\}$.

Proposition on paths and tangent cones. If $\bar u \in S \subseteq \mathbb{R}^{n}$ and if $\bar u$ is a limit point of $S$, then:

$$T_{\bar u}(S) = D_{\bar u}(S). \tag{146}$$

So every tangent vector to $S$ at $\bar u$ is a tangent of a path from $\bar u$ into $S$, and vice versa. (Such paths need not be continuous.)
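As a small numerical illustration of the path definition above, the following sketch approximates the tangent of one path by difference quotients. The set (a parabola), the path `parabola_path`, and the discrete parameter set `T` are assumptions chosen for the example; they do not come from the text. The parameter set is deliberately not an interval, which is allowed because only 0 being a limit point of T is required.

```python
def difference_quotients(path, ts):
    """Return (path(t) - path(0)) / t for each nonzero t in ts."""
    base = path(0.0)
    return [
        tuple((p - b) / t for p, b in zip(path(t), base))
        for t in ts
        if t != 0.0
    ]


def parabola_path(t):
    # Hypothetical path into S = {(a, a**2)}: phi(t) = (t, t**2), phi(0) = (0, 0).
    return (t, t * t)


# T = {0} together with the points 1/n; 0 is a limit point of T.
T = [0.0] + [1.0 / n for n in range(1, 2001)]
quotients = difference_quotients(parabola_path, T)

# The quotient at the smallest positive t approximates the tangent phi'(0) = (1, 0).
tangent_estimate = quotients[-1]
print(tangent_estimate)
```

The quotients approach (1, 0), which is the tangent of this particular path at the origin.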



8. Historical comments and comparisons 

The beginnings. Lagrange illustrated for one equality constraint, and suggested for multiple constraints, a “general principle”:

When a function of several variables is to have a maximum 
or a minimum, and when there are one or more equations among 
these variables, it will suffice to add to the proposed function the 
functions which must vanish, each multiplied by an undetermined 
quantity, and then to seek the maximum or minimum as if the vari- 
ables were independent; the equations that one will find, combined 
with the given equations, will serve to determine all the unknowns. 

Euler [12] has been credited ([7], [9], [14], [27]) with originating a princi- 
ple (“Euler’s rule”), precursor of the Lagrange approach, for extremizing func- 
tionals subject to constraints. This was in the context of the calculus of varia- 
tions. 

We will sketch the history of constraint qualifications separately for Jacobian conditions and path conditions. To simplify the comparison, we focus primarily on inequality constraints.

8.1 Jacobian constraint qualifications 

We outline a sequence of increasingly weaker Jacobian constraint qualifica- 
tions. For simplicity, the constraint qualifications should be interpreted as ap- 
plying only to binding constraints. 

Our translation of the end of Section 58, in Chapter XI of the Second Part of [31]. 

An earlier exposition of the idea is contained in [10], Quatrieme Section, para- 
graphs 1-8, pages 44-49. 

We only discuss Lagrange regularity. See p. 103 for brief notes on quasi-regularity. 
Classical. To obtain Lagrange multipliers, modern treatments of equality-constrained optimization have assumed the constraint functions are $C^{1}$ and have a Jacobian of maximum rank; this allows application of the classical implicit function theorem, which uses a $C^{1}$ hypothesis. Cf. Bolza [7], Caratheodory [9], Bliss [5], [6], as well as many recent textbooks.

1917. Hancock suggests handling constrained maximization problems with inequality constraints (“limitations”) by converting them into equality problems [17], p. 150. A constraint $g(x) \leqq 0$ would be replaced by the equality $g(x) + s^{2} = 0$, where $s$ is an additional (slack) variable.
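The squared slack device can be illustrated on a one-variable toy problem. The sketch below is an assumption-laden example, not a construction from Hancock or from this paper: it minimizes $(x - c)^2$ subject to $x - b \leqq 0$ by solving the Lagrange conditions of the equality-constrained problem $x - b + s^2 = 0$ case by case. The function name and the returned field names are hypothetical.

```python
def solve_with_squared_slack(c, b):
    """Minimize (x - c)**2 subject to x - b <= 0 via the equality x - b + s**2 = 0.

    Stationarity of (x - c)**2 + lam * (x - b + s**2) gives
        2 * (x - c) + lam = 0   and   2 * lam * s = 0,
    so either lam = 0 or s = 0.
    """
    candidates = []
    # Case lam = 0: the constraint is slack, so x = c and s**2 = b - c must be >= 0.
    if b - c >= 0:
        candidates.append({"x": c, "s_squared": b - c, "lam": 0.0})
    # Case s = 0: the constraint binds, so x = b and lam = 2 * (c - b).
    candidates.append({"x": b, "s_squared": 0.0, "lam": 2.0 * (c - b)})
    # Keep the stationary point with the smallest objective value.
    return min(candidates, key=lambda p: (p["x"] - c) ** 2)


binding = solve_with_squared_slack(c=2.0, b=1.0)   # unconstrained minimizer is infeasible
slack = solve_with_squared_slack(c=0.0, b=1.0)     # unconstrained minimizer is feasible
print(binding, slack)
```

In the binding case the multiplier comes out nonnegative, which is the sign information that the later authors discussed here extract from this device.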

1937. Valentine uses the same conversion device in treating a calculus of 
variations minimization problem with inequality side conditions [41]. He also 
determines the sign of the Lagrange coefficients for the inequality constraints. 

1939. Karush’s Master of Science dissertation [25] went unnoticed for many years, although it contains most of the basic concepts and many of the results of later work. He uses the same squared slack variable device as Valentine had to prove his Theorem 3:1 (pp. 11-13), which establishes Lagrange quasi-regularity for the inequalities case. Following that theorem, he notes that with
“normality” (the classical rank condition) and constraints, the Lagrange 
equations would follow; and if the constraints were the multipliers would 
be of the usual sign.

Karush’s Theorem 3:3 shows that the positivity condition $\exists \xi\,[g'(\bar u)\xi > 0]$
of (24a) is sufficient for Lagrange inequality-regularity, assuming that all the 
constraints are C^. 

1953. Pennisi gives constraint qualifications for the mixed problem [36]. 
Assuming that all constraints are C^, are effective, and are satisfied with equal- 
ity at u, he obtains Lagrange regularity for the mixed problem by assuming that 
the matrix $(g'(\bar u), h'(\bar u))$ has full rank (i.e., $m + k$).

1956. Arrow and Hurwicz obtain Lagrange mixed-regularity, using the 
maximum rank condition under assumptions [2]. 

1961. Arrow, Hurwicz, and Uzawa independently discover Karush’s positivity condition $\exists \xi\,[g'(\bar u)\xi > 0]$ (cf. (24a)). Their Theorem 3 shows it is sufficient for Lagrange inequality-regularity. (The centrality of this condition is witnessed not only by its independent rediscovery, but by the fact that, by our
Bolza’s formulation makes clear that is a sufficient smoothness condition for 

Lagrange regularity. Earlier contributions appear to postulate real-analyticity, as in 

Weierstrass [43]. 

Neither Valentine nor Karush seem to refer to Hancock, and Karush does not refer 

to Valentine. 

He remarks (p. 13) that it is possible to reduce the assumption to . 
Theorems 1.B and 2.B, it falls short of minimality only in lacking the alternative (24b).) The smoothness assumptions on the constraints $g$ were reduced to differentiability at $\bar u$.

1967. Mangasarian and Fromovitz consider the mixed problem [32]. The 
smoothness assumptions of Pennisi are reduced from to and Pennisi’s 
full rank condition is reduced to the classical full rank condition on just h'{u), 
together with the existence (as in (23a) above) of a vector $\xi$ such that $g'(\bar u)\xi > 0$ and $h'(\bar u)\xi = 0$.

1969. Mangasarian [33], p. 173, Theorem 6, part (iv) also considers the 
mixed problem, obtaining Lagrange mixed-regularity while reducing the 
constraints on g to differentiability at u, but maintaining the hypothesis on 
h. (In Theorem 1.A above, the smoothness is reduced still further.)

1974. Halkin [16], p. 235, section 3 weakens the smoothness condi- 
tions used by earlier writers for their constraint functions, and he allows the 
underlying space X to be a not-necessarily-finite-dimensional normed linear 
space. He obtains quasi-regularity results through a new proof of the implicit 
function theorem using differentiability, continuity, and Brouwer’s fixed point 
theorem, rather than the usual hypothesis and the contraction mapping the- 
orem. Unaware of his earlier result, we embarked on a similar quest, using very 
similar techniques [22]. The lighter continuity assumptions we present here in 
Theorem 1+ (for finite dimensional spaces) however, provide some extensions 
beyond the “Multiplier Rule” in his Section 3. 

While our Theorem 1 uses essentially the same smoothness hypotheses for 
constraint functions as his Multiplier Rule (the classes Qd and Hdc^ requir- 
ing continuity in a neighborhood), our Theorem 1+ weakens even further the 
continuity hypotheses for the equality constraints (allowing discontinuities in 
every neighborhood) as shown in our Example 4.2. (Our Example 4.3 shows
that no weaker continuity condition of our type would suffice for Lagrange 
regularity.) 

His multiplier conclusion is just Lagrange quasi-regularity, rather than the 
full regularity conclusion of our Theorems 1, 1+, 2, 3, and 4. The weaker quasi- 
regularity conclusion does not require any constraint qualifications, but it is 
less informative than full regularity in two respects: it does not tell when the 



Halkin’s ability to weaken continuity assumptions for equality constraints below
the properties of earlier authors is allowed by his introduction of an implicit 
function theorem with weaker assumptions properties [16], Theorem D than earlier 
authors had used. Our ability to weaken them still further is allowed by our Non-$C^{1}$ Implicit Function Theorem (page 142), which though similar has even weaker smoothness assumptions. Cf. [21].
coefficient $\lambda_0$ in (13) is nonzero, and it does not say that the $\lambda_i$ coefficients for the nonbinding $g_i$ in (10b) are zero.

Halkin allows the underlying space $X$ to be an arbitrary normed linear space, rather than the finite dimensional spaces we postulate. Examination of the proof (in [21]) of our Non-$C^{1}$ Implicit Function Theorem, and the applications we make of it in our Theorems 1 and 1+, suggests that a similar extension of all our theorems is possible.

Finally, Halkin discusses only sufficiency of his conditions, rather than the 
minimality notions of our Theorem 2 and the necessity conditions of Theo- 
rem 4. 



1995. Minimality. Summarizing, we have: 

rank(^^'(^)) = m (Karush (C^), Pennisi (C^), A-H (C^), A-H-U (D^)) 

^ * 

> 0] (Karush (C^), A-H-U (D^)) 



^ * 

[3^[gi\u)^ > 0]] or [wedge(a(l), . . . , a{p)) = iR^] 

(Theorem l.B above). 



(147) 



How much further does the weakening chain go? Theorem 2 says the chain 
stops here, since the last property is a minimal sufficient condition. Parts (A) 
and (C) of Theorem 2 show that analogous minimality results hold for La- 
grange mixed-regularity and equality-regularity. 



8.2 Path constraint qualifications 

We outline a sequence of increasingly weaker path constraint qualifications. 

1939. Karush obtains Lagrange inequality regularity in his Theorem 3:2, 
assuming constraints and imposing a path condition (“Property Q”) that we 
can write as: L{g) ^ A{C{g))c^, denoting by A{C{g))c^ the set of deriva- 
tives (i.e. directions) of “arcs,” or paths from u into the constraint set. 



Property $\lambda_0 \neq 0$ would follow under the special Jacobian rank condition (31) of Corollary 1a: otherwise the quasi-regularity condition (13) would violate the quasi-regularity condition $(\lambda_0, \lambda, \mu) \neq 0$.

Theorem V.3.3.2 in Hurwicz [20] establishes a full Lagrange regularity (“quasi- 
saddle-point”) for constraints in Banach spaces, using an infinite-dimensional ex- 
tension of the Kuhn-Tucker constraint qualification. 

Ioffe and Tihomirov [23] contains extensions of these and other results to infi- 
nite dimensional spaces, as well as references to other literature. 
1951. Kuhn and Tucker’s Theorem 1 independently retraces part of the
path traveled earlier by Karush. They prove Lagrange inequality-regularity us- 
ing the weaker assumption of differentiability for constraints, rather than C^. 
The Kuhn-Tucker “constraint qualification” says $L(g) \subseteq A(C(g))_{D^{1}}$, where $A(C(g))_{D^{1}}$ is the set of derivatives of paths from $\bar u$ into the constraint set that are differentiable at $\bar u$ [29]. They provide more information about the
Lagrange multipliers (analogous to our condition (10b)). Kuhn later became 
aware of Karush’s work and earlier references to it ([11], [36], [40], et al.), 
and related research by others. See [30] for his very informative historical ac- 
count, and some general comments on the several quite divergent interests that 
converged in similar theorems. 

1961. Arrow, Hurwicz, and Uzawa weaken the constraint qualification further, assuming only that $L(g) \subseteq \mathrm{cl}(\mathrm{ch}(A(C(g))_{D^{1}}))$. Calling this “Constraint Qualification W,” their Theorem 1 proves it is sufficient for Lagrange inequality-regularity.

1967. Abadie weakens the Kuhn-Tucker Constraint Qualification in a different way. He assumes that $L(g) = T_{\bar u}(C(g))$ and that all the constraints are differentiable at $\bar u$, and he proves Lagrange mixed-regularity (Theorem 4 of [1]).

1971. Gould and Tolle weaken the constraint qualifications of both Abadie 
and Arrow, Hurwicz, and Uzawa. They assume A{C{g)) jji = cl(ch(T^(C( 5 )))) 
and prove Lagrange mixed-regularity, assuming both differentiability and local 
continuity of the constraints at $\bar u$. This applies the closure and convex hull operations used by Arrow, Hurwicz, and Uzawa, but to the tangent cone $T_{\bar u}(C(g))$ used by Abadie [1], rather than to the smaller set $A(C(g))_{D^{1}}$ of [4].

1995. The Tangency-Path Criterion of page 124 above weakens the constraint qualification further. Requiring just existence of partial derivatives at $\bar u$, it requires only $L(g) \subseteq \mathrm{cl}(\mathrm{ch}(T_{\bar u}(C(g))))$. The stronger equality property used by Gould and Tolle follows automatically with their differentiability assumptions (according to Abadie’s Lemma 4), but not with our weaker partial differentiability assumptions, as can be seen from the following example:

Example 8.1. 



Their main results are formulated for the case of inequalities, where the variables 
X are also required to be nonnegative. Certain other types of constraints are briefly 
dealt with in their Section 8. 

Actually the proof of his Theorem 4 only uses A(C(^))^i Q Tu{C(g)), but under 
his differentiability assumption his Lemma 4 shows that the inclusion automatically 
holds in the opposite direction. 

See also (75), p. 126, and (76), p. 126. 
$$g^{1}(x_1, x_2) = \begin{cases} x_1, & \text{for } x_2 = 0 \\ 0, & \text{otherwise,} \end{cases} \qquad g^{2}(x_1, x_2) = \begin{cases} x_2, & \text{for } x_1 = 0 \\ 0, & \text{otherwise.} \end{cases} \tag{148}$$

This pair fails to satisfy the Frechet differentiability and continuity conditions that Gould and Tolle impose, as well as their $L(g) = V(g)$ constraint qualification. Nevertheless, it is partially differentiable at the origin $(0, 0)$, and satisfies our criterion (72) since $L(g) \subseteq V(g)$, and hence is Lagrange inequality-regular.

We developed our Tangency-Path Criterion based in part on Hestenes’
use of the tangent cone ([18], pp. 25 ff., [19], pp. 203 ff.) and in part by anal- 
ogy with Constraint Qualification W of Arrow, Hurwicz, and Uzawa [4]. Sub- 
sequently we discovered the paper by Gould and Tolle [15] with its closely 
related, but stronger constraint qualification. 

Despite the stronger hypotheses of Gould and Tolle, and their more de- 
manding constraint qualification, the proofs we had developed for our Theo- 
rems 3 and 4 were similar to certain parts of Gould and Tolle’s proof. 

Summarizing, we have: 

L{g) g A{C{g))c^ (Karush (C^)) 

L{g)g A{C(g))Di (Kuhn-Tucker (D^)) 

^ fcl(ch(^(C(5))z50) (A-H-U(Z)I)) 

- \Tfi(C(g)) (Abadie (Di)) 

L{g) = cl{chiTu{C{g)))) (Gould-Tolle (D^)) 

4): (Example 8.1) 

L{g) ^ cl{ch{Tu{C{g)))) (our Theorem 3 (partial derivatives)). 

(149) 

How much further does the weakening chain go? Theorem 4 says the chain 
stops here, since the last property is necessary for Lagrange inequality-regularity. 

Note that $g'(0, 0)$ is the identity matrix, so $L(g)$ is the nonnegative quadrant of $\mathbb{R}^{2}$. Since $C(g)$ is all of $\mathbb{R}^{2}$ except the negative rays of the axes, $V(g)$ includes much more than the nonnegative quadrant, so our Theorem 3 implies that $g$ is Lagrange inequality-regular.

Page 124. 



g^{xi,X2) = 
g^{xi,X2) = 
Parts (A) and (C) of Theorem 4 show that analogous necessity results hold for
Lagrange mixed-regularity and equality-regularity.

The tangent cone was applied to optimization problems by Hestenes [18], 
[19], Abadie [1], and Varaiya [42]. It was defined by Bouligand [8], paragraph 
68, pp. 65-66 (as the contingent set). 
1. Introduction

Let $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t \in [0,\infty)}, P)$ be a filtered probability space satisfying the usual conditions, and let $\{B_t\}_{t \in [0,\infty)}$ be a $d$-dimensional $\mathcal{F}_t$-Brownian motion. Let $T > 0$, and let $\sigma : [0, T] \times \mathbf{R}^D \to \mathbf{R}^D \otimes \mathbf{R}^d$ and $b : [0, T] \times \mathbf{R}^D \to \mathbf{R}^D$ be continuous functions.

For each $s \in [0, T]$ and $x \in \mathbf{R}^D$, let $X(t; s, x)$, $t \in [s, T]$, be a solution of the following SDE.

$$X(t; s, x) = x + \int_s^t \sigma(r, X(r; s, x))\,dB_r + \int_s^t b(r, X(r; s, x))\,dr, \quad t \in [s, T]. \tag{1}$$

We assume that the above SDE (1) has a pathwise unique solution for every $(s, x) \in [0, T] \times \mathbf{R}^D$.

Let $\mathcal{S}_s^t$, $0 \le s \le t \le T$, be the set of $\mathcal{F}_t$-stopping times $\tau$ with $s \le \tau \le t$. Let $g : [0, T] \times \mathbf{R}^D \to \mathbf{R}$ be a continuous function satisfying suitable conditions. Then, concerning the pricing of American derivatives, we are interested in computing the following value function

$$u(s, x) = \sup\{ E[g(\tau, X(\tau; s, x))];\ \tau \in \mathcal{S}_s^T \}, \quad (s, x) \in [0, T] \times \mathbf{R}^D.$$
There have been several attempts to compute the value function $u$ numerically. However, there seems to be no good method when $D$ is not small. Let $N \ge 2$ and let $T_n$, $n = 0, 1, \ldots, N$, be numbers such that $0 = T_0 < T_1 < \ldots < T_N = T$. Let $\mathcal{S}_n$, $n = 0, 1, \ldots, N$, be the set of $\mathcal{F}_t$-stopping times taking values in $\{T_n, T_{n+1}, \ldots, T_N\}$. Concerning the pricing of Bermuda type derivatives, we are interested in computing the following value functions.

$$v_n(x) = \sup\{ E[g(\tau, X(\tau; T_n, x))];\ \tau \in \mathcal{S}_n \}, \quad n = 0, 1, \ldots, N.$$

Let us define a probability measure $P_n(x, \cdot)$ over $\mathbf{R}^D$ for each $n = 0, 1, \ldots, N-1$, and $x \in \mathbf{R}^D$ by

$$P_n(x, A) = P(X(T_{n+1}; T_n, x) \in A), \quad \text{for a Borel set } A \text{ in } \mathbf{R}^D,$$

and define an operator $P_n$, $n = 0, 1, \ldots, N-1$, by

$$P_n f(x) = \int f(y)\,P_n(x, dy) = E[f(X(T_{n+1}; T_n, x))]$$

for a measurable function $f$ on $\mathbf{R}^D$. Then $v_n$, $n = N, N-1, \ldots, 0$, are given inductively by the following.

$$v_N(x) = g(T_N, x),$$

$$v_{n-1}(x) = (P_{n-1} v_n)(x) \vee g(T_{n-1}, x).$$

So the value function $v_0(x)$ is easily given mathematically. However, if $D$ is not small, it is not easy to store a function on $\mathbf{R}^D$ in memory, and so practically it is not easy to compute $v_0(x)$.
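The backward induction for the Bermuda value functions can be sketched on a finite state space, where storing a function is trivial. The example below is only an illustration under assumptions not found in the paper: the underlying process is replaced by a hypothetical three-state Markov chain, the operator $P_n$ becomes a transition matrix, and the payoff is a put on made-up price levels.

```python
def bermuda_values(transition, payoff, n_steps):
    """Return [v_0, ..., v_N], each a list indexed by state, using
    v_N = g(T_N, .) and v_{n-1} = max(P_{n-1} v_n, g(T_{n-1}, .))."""
    n_states = len(transition[0])
    values = [None] * (n_steps + 1)
    values[n_steps] = [payoff(n_steps, x) for x in range(n_states)]
    for n in range(n_steps - 1, -1, -1):
        # Continuation value (P_n v_{n+1})(x) as a matrix-vector product.
        continuation = [
            sum(p * v for p, v in zip(transition[n][x], values[n + 1]))
            for x in range(n_states)
        ]
        values[n] = [max(continuation[x], payoff(n, x)) for x in range(n_states)]
    return values


# Hypothetical data: three price levels and the same transition matrix at each date.
prices = [80.0, 100.0, 120.0]
one_step = [[0.5, 0.5, 0.0], [0.25, 0.5, 0.25], [0.0, 0.5, 0.5]]


def put_payoff(n, x):
    return max(100.0 - prices[x], 0.0)


values = bermuda_values([one_step, one_step], put_payoff, n_steps=2)
print(values[0])
```

With $D$-dimensional continuous state spaces the list `values[n]` has no finite analogue, which is the storage difficulty mentioned in the text.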

Several authors have suggested Monte Carlo methods to compute the value function ([1], [2], [5], [6]). In this paper, we discuss the method given by Longstaff and Schwartz [4].
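To fix ideas, here is a minimal sketch of a least-squares Monte Carlo scheme in the spirit of Longstaff and Schwartz. It is not the exact estimator analysed in this paper: the driving process (a Gaussian random walk), the put payoff, the linear basis $\{1, x\}$, the restriction of the regression to in-the-money paths, and the absence of discounting are all assumptions made for the example, and every function name is hypothetical.

```python
import random


def fit_line(xs, ys):
    """Least-squares coefficients (a, b) of y ~ a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0.0:
        return mean_y, 0.0
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    return mean_y - b * mean_x, b


def longstaff_schwartz_put(x0, strike, n_steps, n_paths, vol, seed=0):
    """Estimate a Bermuda put value on a Gaussian random walk (no discounting)."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        path = [x0]
        for _ in range(n_steps):
            path.append(path[-1] + vol * rng.gauss(0.0, 1.0))
        paths.append(path)

    def payoff(x):
        return max(strike - x, 0.0)

    # cash[i] is the payoff realised on path i under the current stopping rule.
    cash = [payoff(path[n_steps]) for path in paths]
    for n in range(n_steps - 1, 0, -1):
        in_money = [i for i, path in enumerate(paths) if payoff(path[n]) > 0.0]
        if len(in_money) < 2:
            continue
        # Regress realised future cash on the current state to estimate continuation.
        a, b = fit_line([paths[i][n] for i in in_money], [cash[i] for i in in_money])
        for i in in_money:
            if payoff(paths[i][n]) > a + b * paths[i][n]:
                cash[i] = payoff(paths[i][n])
    # At time 0 all paths share the state x0, so compare with the sample mean.
    return max(payoff(x0), sum(cash) / n_paths)


price = longstaff_schwartz_put(x0=100.0, strike=100.0, n_steps=4, n_paths=2000, vol=5.0, seed=1)
print(price)
```

The two error sources discussed later in the paper show up here as the finite number of simulated paths and the finite regression basis.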

Let Wn = n = 0,1,..., AT, and let a; 6 R^, be 

the distribution of {X{Tm', T„, on W„. Then Pi"^ , n = 0, 1, . . . , A?’, 

a: € R^, is a time inhomogeneous Markov chain on R^. Let i^o be a prob- 
ability measure over R^. Let Ln > l,n = 0, 1,...,AT-1, and 
= {Xn/{m)}^^o^ ^ — I5 • • • ? n = 0, 1, . . . , A^ - 1, are identically inde- 
pendent random vectors defined on the probability measure (fi, P, P) whose 
distribution is P^q^ = Pi°Vo(dx). Let > 1, ^ = 0, 1, . . . , A^ - 1, and 
fc = 1, . . . , Kn^ n = 0, 1, . . . , A/” — 1, are functions on R^. 

We make the following assumption (A1).

(A1) $D_n$, $n = 0, 1, \ldots, N-1$, are measurable sets in $\mathbf{R}^D$ such that $(P_n v_{n+1})(x) > g(T_n, x)$ for any $x \in \mathbf{R}^D \setminus D_n$.

Remark 1 (1) $D_n = \mathbf{R}^D$ satisfies the assumption (A1).

> 0,forany (t,x) G [0,T]xR, thenL)^ = {x e R^;^(T^,x) > 
0} satisfies the assumption (Al). 

We define random functions : R^ x O R, n = — 1, . . . , 1, 0, 

inductively by the following. 



Hn{x) = 1. 

When Hn+i = {Hm}^=n+n are defined, we let 

an,£ = min{m > n + 1; Hm{Xn/{m)) >0}, i = l,...Lr, 
and let {an,k}k=i minimizing point of the function 

^ £=1 
Kn 

k=l 

Finally we define Hn by 

^ OiAni x) — ^n,k'^n^k{x) X G Dri 



xeR^\Dr 



Let fio : ^ R be given by 



1 

vq = J- Xo/icfoA) 



£=1 



In the present paper, we discuss the estimate on A[(i)o V g{0,xo) 
xo{xq))‘^ a 1] when i/o{dx) = Sx^{dx) (see Corollary 3.2). 



2. Preliminary results 

Let $W_n = (\mathbf{R}^D)^{N+1-n}$, $n = 0, 1, \ldots, N$, and let $P_x^{(n)}$, $x \in \mathbf{R}^D$, be the distribution of $\{X(T_m; T_n, x)\}_{m=n}^{N}$ on $W_n$. Then $P_x^{(n)}$, $n = 0, 1, \ldots, N$, $x \in \mathbf{R}^D$, is a Markov chain on $\mathbf{R}^D$.

For any measurable function h on R^ and n,m = 0,1,. .. ,N with n < 
m, let r„(-; h) :Wn-^ {m, N} by 

^ h{w{m)) > 0, 
Lemma 2 Let $h_n : \mathbf{R}^D \to \mathbf{R}$, $n = 0, 1, \ldots, N$, be given, and assume that $h_n(x) \le 0$, $x \in \mathbf{R}^D \setminus D_n$, and that $h_N(x) = 1$. Let $\sigma_n : W_n \to \{n, n+1, \ldots, N\}$ be given by

$$\sigma_n(w) = \sigma_n(w; \{h_m\}_{m=n}^{N}) = \bigwedge_{m=n}^{N-1} \tau_m(w; h_m), \quad w \in W_n.$$

Moreover, let $u_n : \mathbf{R}^D \to \mathbf{R}$ be given by

$$u_n(x) = u_n(x; \{h_m\}_{m=n}^{N}) = E^{P_x^{(n)}}[g(T_{\sigma_n}, w(\sigma_n))], \quad x \in \mathbf{R}^D.$$

Then we have the following.

(1) $|u_n(x) - v_n(x)| \le |P_n(u_{n+1} - v_{n+1})(x)| + 1_{D_n}(x)\,|P_n u_{n+1}(x) - (g(T_n, x) - h_n(x))|$ for any $n = 0, 1, \ldots, N-1$, and $x \in \mathbf{R}^D$.

(2) $|u_n(x) - v_n(x)| \le |P_n(u_{n+1} - v_{n+1})(x)| + 1_{D_n}(x)\,1_{\{1\}}\big(\operatorname{sgn}(P_n u_{n+1}(x) - g(T_n, x))\operatorname{sgn}(h_n(x))\big)\,|P_n u_{n+1}(x) - g(T_n, x)|$.

Here

$$\operatorname{sgn}(a) = \begin{cases} 1, & a > 0, \\ 0, & a = 0, \\ -1, & a < 0. \end{cases}$$



Proof Note that Un{x) < Vn(x), for all n = 0, 1, . . . , A^” - 1, and x € R^. Let 

Un(x) = g(Tn,x) - hn(x), X E R^. 

Let n = 0, 1, . . . , N' - 1, and x e R^, and fix them for a while. 

Case 1. Suppose that hn(x) > 0. 

Then we see that x G Dn and g(Tn, x) > Un(x). So we have 



yn(x) = g(Tn,x)-h(PnVn+l(x)-g(Tn,x))V0 < g(Tn, x)-hlPnVn+l(x)-Un(x)l. 

This implies 



g{Tn,x) > Vn(x) - |Pn(^n+l -^n+l)(^)| “ |Pn^n+l(^) ~ Un(x)l. 

Case 2. Suppose that hn(x) < 0, and x e Dn- 
Then we see that g{Tn, x) < Un{x). So we see that 

Vn{x) < PnVn-i-l{x) V Un{x) 

^ -^n^n+l(^) P \Pn(d^n-\-l ^n+l)(^)| “1“ |-^n^n+l(^) 

Case 3. Suppose that hn{x) < 0, and x G R^ \ Dn- 
Then we see that g(Tn^x) < {PnVn-\-i){x). So we have 



Xn{x) — PnVn-\-l{x^ < PnUn-{-l{x^ + \P n(^Un-\-l '^n+l)(^)|* 
So we see that for any $n = 0, 1, \ldots, N-1$,

— ^{hn>0}{^n 1-Prt(^n+1 ^n+l)| l-^n^n+l "^n]) 

<0} ^Dn {^n \Pn ('^n+1 't^n+1 ) 1 ~ \Pn'^n-j-l '^n\) 

+ l{/in<0}lR^\Dn('^ri - |^n('^n+l ~ '^n+l)|)- 

Thus we see that 

0 ^ ^ |-^ n(^n+l ^n+l)| H“ |lD^-fn^n+l "^nl* 

This implies the assertion (1). 

Now let us prove the assertion (2). Let ^ is a positive measurable function 
on R^. Since Tn{w; ^hn) — Tn{w; hn), we see from the assertion (1) that 

\Un{x) - Vn{x)\ < |Pn('^^n+l " Vn+l){x)\ 

^Dn (^) |-^n^n+l (^) 9{'^n-) ^(^)^n(^) | • 

Noting that 

inf{a4-t6; t > 0} — l^ij{sgn{a)sgn{b))\a\, a,6GR, 
we have the assertion (2). 

This completes the proof. □ 

Let uq be a probability measure on R^ and define probability measures 

— 1 , . . . , A^, inductively by 

iyn+i{dx) ^ / Pn[y\dx)un{dy), n = 0, 1, . . . , A^ - 1. 

JnD 

Then we have the following as an easy consequence of Lemma 2. 1 . 

Corollary 3 Let hn and Un be the same as the previous lemma. Then we have 
the following. 

(/ \Un{x)-Vn{x)fVn{dx)y^'^<\{ |u„+i -t;„+i (x) (o!x)) 

+ f \PnUn+l{x)-{g{Tn,x)~hn{x))\^y^^ 
JDn 

for any n = 0, 1, . . . , A^ — 1. 
3. Main result

Let us consider the situation described in the Introduction.

In addition to (A1), we assume the following.

(A2) k = 1, . . . , Kn, is linear ly independent in L‘^{Dn] dun), ^ 
0, 1, . . . , W — 1, where is the probability law of w{n) under {dw). 

(A3) / 'ipn,ki^)'^^n{dx) < oc A: = 1, . . . , Kn, n 0, 1, . . . , W - 1, 

Jor, 

and 



r ^ 

/ 9{T„,w{Tn))*]i^o{dx) < oo, 

m=l 



n = 0, 1, . . . , W. 



Let Un{x) = Un{'', {i/rn}m=n))(^)‘ '^n is as in Lemma 2.1. Let 

{dn,k]k=i minimizing point of the function 



Kr, 



n -*'■ n 

Fn{{ak}k=l)= / \{PnUn+l){x) -'^an‘4)nA^))\‘^’^n{dx). 

fc=l 



'Dn 

Then we have the following. 

Theorem 4 (1) There is a constant C > 0 such that 

Kn 

fe=l 



c 



( 2 ) 

( / \Un{x) - Vn{x)\'^l'n{dx))^^^ 

< ( / \Un+l{x) - Vn+l(x)\'^V„+i{dx)y^^ 

JH.D 

. Kn 

“f( / ^7i,fc)^n,/c(^)) t^nidxy^ ^ 

k=l 

n Kn 

+ inf{( / \{PnUn+l){x) ~'^aktpnA^)?^n{dx)Y^‘^-, dfc G R.5 

fc=l 

A: = 1, . . . , ATn}. 

Frc?6>/ Let ^ = 0, 1, . . . , A' - 1, be the cr-algebra generated by Xn,^, ^ = 
1, . . . , Ln, and let >Bn, n = 0, 1, . . . , A — 1, be the cr-algebra generated by 
Inductively, we see that Hn is -measurable, n = — — 

2, . . . , 0. Also, we have 



Kn 



Kr^ 



^’n({aaf=l)= E 



k,k' = l 



k=l 



where 



^ Lri 

^n,k,k' ~ Y~ 



^n^k ~ T '^^{^Dni^n,k){^n,£)9{T(j^ ^^Xn/{crn £)). 



e=i 



Note that (7^,^ = an+i(Xn,£(-); {Hm}^=n+i )- Therefore we have 



^(2) pP 

^n,k,k' ~ ^ 



and 






[Enlk'\En+l]= [ '(pn,ki^)'>Pn,k'ix)Un{dx), (2) 

%\En+l] = f 1pn,k{x){PnUn+l)ix)Un{dx). (3) 

’ JDn 



T pt = ^(2) __ 

L^Cl ^n^k,k' '^n,k,k' 



and rl^l = Also, let 

r^(2) \D ^(2) r^(2) \D 

“ l'^n,/c,fcM/c,/c' = l’ “ \^n,k,k' Jk,k' = l^ l^n,fc,/c' J /c,/c' = l ’ 

= {Cn,fe}£=i> and Then andi?i^^ 

are are D x D random matrices, Cn \ Cn^ and are D-dimesional random 
vectors, and is a nonrandom D x D matrix. 

Then we see that 



~ _ ^(2)-l (1) - _ ^(2)-l-(l) 



n = 0, . . . , AT — 1. 



Also, we see that 



EVnlk'f] 



J-'n 

- / ^n,k{xfi^n,k'{xf^n{dx), 

Tn J Dn 



and 






= —E[Var[lD„{Xn,l{n))^l)nA^n,l{n))g{an,l,Xn^l{(7„^l))\Bn+l]] 
n J Vn 

w(t7n+i(w; {^fm}m=„+l)))^]('n(*:). 

U!,''l < !>!». 



So we have 
where 

= (/ 'l/’n,k(xf)l^n(dx)y''^ 

JDn k=l 



Kn 



if max g(r„,w(r„))2]t'o(dx))^/2. 

Jrd m=l,...,N 



also, we have 



K 

E^[\\ Rn'’ \?]< ^ j C^i^n,ki^ff^n{dx) ( 4 ) 

and 

K 

E^[\r\l^^\ < f-if 

-'Dn k=l 

if max o(T„,u;(T„))‘‘]z/o(<ia:))^/2_ (5) 

JjlD m=l,...,N 

If II Cn^~^Rn^ ||< 1/2, we have 

II (Cfl + /!?>)-■ -Cf-‘ II 

=11 ((I + - /)(?«-■ ii< 2 II c«-‘ nil fii?) II . 

Here || • || is the operator norm of a matrix. So if || ill II < 1/2 

and |ri^* | < 1, we have 

!«. - «„l = |((CP> + ii?))-' - cf i-'xc!.-' + r<‘>) + <;»>-x(«i 

< 2 II c®-‘ nil R|?' 11 (i4‘>i + 1)+ II Of-' II x<'i|. 

So we have 

E^[\cLn - d„|^ A 1] 

< ^^[|d„ - a„p, II C^)-1 nil i?(2) ||< 1/2, |r(i)| < 1] 

+P(||C^)-i ||||P(2) ||>l/2) + P(|rW|>l) 
<(8(6^ + 1)2 ||C(2)-i f+4 ||C^)-i ||-2) 

E^[\\ i?(2) ||2] + (2 II c(2)-i ||2 +i)E^[|r(D|2]. 

This and Equations (4) and (5) imply the assertion (1). 

The assertion (2) is an easy consequence of Lemma 2.1. □ 

Let Vn = Ylk=i c L^(R^; dz^n), n = 0, 1, . . . , iV - 1. Then it is 

easy to see that are determined by i ^ l,...,Ln,n = 0,...,N 
and V^’s , and are independent of a choice of bases {’>pn,k}k=i - Let 

dn = inf{( / 'il)k{,xffvn{dx)f/‘^\ {^k)k=i is a orthogonal basis of V„}, 

and 

co = {[ max g{Tm,w{Tm))'^]i^o{dx))^^'^. 

Then we have the following from the proof of Theorem 3.1. 

Corollary 5 E[{ [ \Un{x) - Vn{x)\‘^Un{dx)) A 1]^/^ 

JUD 

< E[{ [ |[/„+i(x)-n„+i(x)|2r.„+i(dx))Al]i/2+8L;i/2^^(j^y2cV2^^) 

+-E[inf{( f \{PnUn+i){x) - ^{x)\‘^i'n{dx)); i> eVn} A 
JDr, 

Then we have the following. 

Corollary 6 Let vg be as in Introduction, and assume that uq = 6xq for some 
Xq G . Then we have 

E[{vo V ff(0,xo) - vo{xo)f A < Ei +£ 2 , 

where 

E, = Lo ^ 8(L„)-i/2d„(Ky2cy2 + 1) 

n=l 

and 

N-l . 

^2 = V -B[inf{( / \{PnUn+i){x) - V’(2;)|2r'„(da;)); -ipeVn} A 
^-1 dOn 
Proof. Note that vq{xq) = {PiVi){xq) V ^(0, xq). We also see that 
E[g{T,,^„XoA^o,emi] = {PiUi)(xo). 

So we see that 

E[{vo - {P,Ui)(xo)f] < 

Since 

1(^0 V g{0, Xo) - •yo(a;o)| < l^o - (PiC/i)(a;o)| + |(-Pif/i)(a;o) - (-PiVi)(a:o)| 

< |vo - (-Pi?7i)(a;o)| + ( [ \Ui{x) - vi{x)fi/i{dx)y^^, 

JRO 

we have our assertion from Corollary 3.1. 

This completes the proof. □ 

Remark 7 We may call $E_1$ and $E_2$ in Corollary 3.2 the simulation error and the approximation error, since $E_1$ is an error caused by Monte Carlo simulation and $E_2$ is an error caused by the approximation by systems of functions. $E_1$ tends to zero as $L_n$ tends to infinity.

We will study how small $E_2$ is in a special case in the next section.



4. Relation to hypoellipticity 

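Let $V_0, V_1, \ldots, V_d \in C_b^\infty(\mathbf{R}^D; \mathbf{R}^D)$. Here $C_b^\infty(\mathbf{R}^D; \mathbf{R}^D)$ denotes the space of $\mathbf{R}^D$-valued smooth functions defined in $\mathbf{R}^D$ whose derivatives of any order are bounded. We regard elements in $C_b^\infty(\mathbf{R}^D; \mathbf{R}^D)$ as vector fields on $\mathbf{R}^D$.

Now let $X(t, x)$, $t \in [0, \infty)$, $x \in \mathbf{R}^D$, be the solution to the Stratonovich stochastic integral equation

$$X(t, x) = x + \sum_{i=0}^{d} \int_0^t V_i(X(s, x)) \circ dB^i(s). \tag{6}$$

Then there is a unique solution to this equation. We think of the SDE (6) instead of the SDE (1) as the SDE for the underlying stochastic process.

Let $A = \{\emptyset\} \cup \bigcup_{k=1}^{\infty} \{0, 1, \ldots, d\}^k$ and for $\alpha \in A$, let $|\alpha| = 0$ if $\alpha = \emptyset$, let $|\alpha| = k$ if $\alpha = (\alpha^1, \ldots, \alpha^k) \in \{0, 1, \ldots, d\}^k$, and let $\|\alpha\| = |\alpha| + \operatorname{card}\{1 \le i \le |\alpha|;\ \alpha^i = 0\}$. Let $A_0$ and $A_1$ denote $A \setminus \{\emptyset\}$ and $A \setminus \{\emptyset, 0\}$, respectively. Also, for each $m \ge 1$, $A(m)$, $A_0(m)$ and $A_1(m)$ denote $\{\alpha \in A;\ \|\alpha\| \le m\}$, $\{\alpha \in A_0;\ \|\alpha\| \le m\}$ and $\{\alpha \in A_1;\ \|\alpha\| \le m\}$ respectively.

We define vector fields $V_{[\alpha]}$, $\alpha \in A$, inductively by

$$V_{[\emptyset]} = 0, \qquad V_{[i]} = V_i, \quad i = 0, 1, \ldots, d,$$

$$V_{[\alpha * i]} = [V_{[\alpha]}, V_i], \quad i = 0, 1, \ldots, d.$$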
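The weighted length of a multi-index and the finite index sets it induces are easy to enumerate. The sketch below is illustrative only; the function names are hypothetical, multi-indices are represented as Python tuples, and the enumeration targets the set written $A_1(m)$ in the text (nonempty multi-indices other than the single index 0, with weighted length at most $m$).

```python
from itertools import product


def weighted_length(alpha):
    """||alpha|| = |alpha| + number of entries of alpha equal to 0."""
    return len(alpha) + sum(1 for a in alpha if a == 0)


def indices_up_to(d, m):
    """All nonempty alpha over {0, ..., d}, other than (0,), with ||alpha|| <= m."""
    found = []
    for k in range(1, m + 1):  # |alpha| <= ||alpha|| <= m bounds the length
        for alpha in product(range(d + 1), repeat=k):
            if alpha != (0,) and weighted_length(alpha) <= m:
                found.append(alpha)
    return found


print(weighted_length((0, 1)))   # length 2 plus one zero entry
print(indices_up_to(d=1, m=2))
```

Entries equal to 0 count twice, reflecting that the index 0 corresponds to the drift direction $V_0$ in equation (6).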
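Definition 8 We say that a system $\{V_i;\ i = 0, 1, \ldots, d\}$ of vector fields satisfies the condition (UFG), if there are an integer $\ell$ and $\varphi_{\alpha,\beta} \in C_b^\infty(\mathbf{R}^D)$, $\alpha \in A_1$, $\beta \in A_1(\ell)$, satisfying the following.

$$V_{[\alpha]} = \sum_{\beta \in A_1(\ell)} \varphi_{\alpha,\beta} V_{[\beta]}, \quad \alpha \in A_1.$$

We assume that a system $\{V_i;\ i = 0, 1, \ldots, d\}$ of vector fields satisfies the condition (UFG) throughout, and let $\ell$ be as in Definition 4.1. Let $\lambda_0 : \mathbf{R}^D \to [0, \infty)$ be a continuous function given by

$$\lambda_0(x) = \inf\Big\{ \sum_{\alpha \in A_1(\ell)} (V_{[\alpha]}(x), \xi)^2;\ \xi \in \mathbf{R}^D,\ |\xi| = 1 \Big\}, \quad x \in \mathbf{R}^D.$$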

aeAi{i) 

Let us define a semigroup of linear operators {Pt}te[o,oo) t>y 

{Ptf)ix) = E[f{X{t,x))l t e [0,^), / e Cb(R^). 

Now let uq be a probability measure on R^ and let 1 ^ 8 , s e (0, oo), be a 
probability measure on R^ given by 

Ot{A) = [ oo{dx)P{X{t,x) e A), 

Jk^ 

for any Borel set A in R^. 

We assume the following, moreover. 

(AM) (1) There is an £ > 0 such that 

[ exp{\x\^^~^^'^)i'o{dx) < oo. 



(2) For any p G (1, oo) 



/ Ao(a:) ^oo{dx) < oo. 

Jn^ 

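Let $\lambda : \mathbf{R}^D \to [0, \infty)$ be given by

$$\lambda(x) = \begin{cases} (\operatorname{trace} A(x)^{-1})^{-1}, & \text{if } \lambda_0(x) > 0, \\ 0, & \text{if } \lambda_0(x) = 0. \end{cases}$$

Then we can easily see that

$$D^{-1}\lambda_0(x) \le \lambda(x) \le \lambda_0(x), \quad x \in \mathbf{R}^D.$$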
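The two-sided bound can be checked numerically in the simplest case. The sketch below assumes that $A(x)$ is a symmetric positive definite matrix whose smallest eigenvalue is $\lambda_0(x)$, and takes $D = 2$ so that closed forms are available; the sample matrix and the function names are made up for the example.

```python
import math


def min_eigenvalue_2x2(a, b, d):
    """Smallest eigenvalue of the symmetric matrix [[a, b], [b, d]]."""
    return 0.5 * ((a + d) - math.sqrt((a - d) ** 2 + 4.0 * b * b))


def inverse_trace_reciprocal_2x2(a, b, d):
    """(trace A^{-1})^{-1}, which equals det(A) / trace(A) for an invertible 2x2 matrix."""
    return (a * d - b * b) / (a + d)


# Hypothetical positive definite matrix [[2, 1], [1, 2]] with eigenvalues 1 and 3.
lam0 = min_eigenvalue_2x2(2.0, 1.0, 2.0)
lam = inverse_trace_reciprocal_2x2(2.0, 1.0, 2.0)
print(lam0 / 2.0, lam, lam0)
```

The general argument is the same: the trace of the inverse is the sum of the reciprocal eigenvalues, which lies between $1/\lambda_0$ and $D/\lambda_0$.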
Also, we have the following (see [3]). 
Proposition 9 (1) For any n > 1 and i\,. . . ,in G {1, . • . , N}, there is a 
(7 > 0 such that 



dn 



• • • dx'^^ 



-A(a;)| < a; € R^. 



(2) For any n>l and ii , . . . , G {1, . . . , D} there is a constant C > 0 such 
that 

for any bounded measurable function f : R and t G (0, 1]. 



(3) For any p e (l,oo) and 5 > 0, 

[ Xo{x)~^Us{dx) 
JRD 



< CX). 



Let Vn^n> 1, be the set of polynomials on R^ of degree at most n. Then 
we have the following. 

Theorem 10 (1) Let 5 = (e A l)/2. Then 

I exp{\x\^^~^^^)iys{dx) <00, 5 > 0. 

Jk^ 

(2) For each $t, s > 0$, $n \ge 1$, and a bounded measurable function $f : \mathbf{R}^D \to \mathbf{R}$,
let 

dn{f-,t,s) =inf{( / |Pt/(a;) - g e Vn}- 

Also, let 



$d_n(t, s) = \sup\{ d_n(f; t, s);\ f \text{ is a bounded measurable function with } |f| \le 1 \}$.

Then n^dn{t, s) 0, as n oo,for any t > 0 and s >0. 

Proof We need some preparations to prove Theorem 4.1. The assertion (1) 
is obvious, since there is a constant a > 0 for each 5 > 0 such that 

sup{E[exp(a|X(s,x) -xp)]; x G R^} < oc. 

Let us prove the assertion (2). First we fix ^ > 0 and 5 > 0. Let S be the 
set of / € with || / ||oo< 1. 

Let '0 G C^(R) be such that 0 < 0 < 1, 0(z) = 1, z < 1, and ^|J(z) = 0, 
z >2. Also, let (fr ^ C'^(R^), r > 1, be given by 

(Pr{x) — 0(r“^|a;|)(l — 0(rA(a;)), x G R^. 

Then for each k >0 there is a constant Ci^k such that 




Monte Carlo method for pricing of Bermuda type derivatives 






\W^iPr{x)\ < Ci,fcr''l(i,oo)(?’A(a;))l[0,2r)(|2;|), X e R^. 
So we see that for each A: > 0 there is a constant C 2 ^k such that 

\V\^rPif){x)\ < C2,kr^l[0,2r){\x\), X G 

for any / C 5. 

Also, from the assertion (1), we see that




- (fr{x)\iys{dx) = o{r ^), 



r — » (X) 



for any p G [1, cxo). 
Let 



(7) 



( 8 ) 



/0r(^;/) = (271-) ^ ex-p{-ix-^)ipr{x){Ptf){x)dx, 

JRD 

^GR^,r>i,fe cr(R^). 

Thus we have 

ifr{x){Ptf){x) = f exp{ix-Opr{^-,f)d^. 

JRD 

Then by Equation (7), we see that for each k > 0 there is a constant 
such that 

|Prte/)|<C3.fcr^+2''(l + |C!')-" (9) 

for any / G 5. Let 7 = (5/4, 

9r,n{x'J) = / exp(ia;-0l[0.n2--l(l^l))Pr(?;/)c(C 

Jr^ 

and ^ 

Pr,n{x\f) = k\^ l[ 0 .n^^](ICl))pr(^;/)<(g 

Then $p_{r,n}(x; f)$ is a polynomial of order $n - 1$. So we see that for each $k \ge 0$
there is a constant $C_{4,k}$ such that

snp{\ipr{x)iPtf){x) - gr,n{x\ /)|; X € R^} < (10) 

for any f ^ S. 

Note that

So we see that



br,n(a;;/) -Pr,n(a:;/)| < [ f l[0,n^->](|$|))|Pr(g; /)MC 

Jrd ni 



< Csnr 



D 



n 



D-\-2'yn 



n 



for any f e S. Note that 

if |:r|2V,(dx))i/2 < (/ |x|2(i+^)V,(dx))i/(2(i+^)) 

Jrd Jrd 

< C4((2n)!)i/(2(i+5))^ 

where C 4 = (J^d Then we see that 

( [ l5r,n(a;;/) - Pr,n{x] f)\'^l^s{dx)y^^ 



< C4C3,o(rn)^^((2n)!)i/(2(i+^)). 



2'yn 



n\ 



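Since $2\gamma + (1+\delta)^{-1} < 1$, we see that

$$\sup\Big\{ \Big( \int_{\mathbf{R}^D} |g_{r,n}(x; f) - p_{r,n}(x; f)|^2 \, \nu_s(dx) \Big)^{1/2} ;\ f \in S \Big\} = o(n^{-p}), \qquad n \to \infty, \qquad (11)$$

for any $p \in (1,\infty)$.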

Therefore Equations (7), (9), (10) and (11) imply our assertion.

Abstract. Wald was the first to solve the existence problem in economics. His paper
should be regarded as a starting point of axiomatization in economics. The movement
toward axiomatization in economics took place only in Vienna. We must not
dismiss the intellectual background in Vienna, which can be classified under the
following headings: (a) the research activities in economics, (b) Hilbert’s philosophy on
the foundations of mathematics, (c) the Vienna Circle’s philosophy of science, and (d)
the development of convex analysis. In this paper, we examine how the interplay
of these factors led Wald to the existence proof.

1. Introduction

Abraham Wald was the first to solve the existence problem in economics
[53]. A couple of years later, John von Neumann succeeded in proving the
existence of balanced growth in a multi-sectoral economy [50]. The publication
of these two papers should be regarded as a starting point of axiomatization
in economics. The papers by Wald and von Neumann were both published in
the Ergebnisse eines Mathematischen Kolloquiums, the annual report
of the colloquium organized by Karl Menger. He was a mathematician in the
field of geometry and a professor at the University of Vienna. Wald intended to
enter the University of Vienna to study geometry under the direction of Menger.
Menger soon appreciated Wald’s ability and invited him to his colloquium.

Among the other outstanding mathematicians of the day who participated in
this colloquium, we must mention such big names as Kurt Gödel, Hans Hahn,
Georg Nöbeling, Franz Alt and Olga Taussky. Wald, Alt, Nöbeling and Gödel
later became co-editors of the Ergebnisse. In addition, Alfred Tarski, Bronislaw
Knaster, Karol Borsuk and Stefan Mazurkiewicz were frequent guests. And
von Neumann dropped in on the colloquium when he passed through Vienna during
his summer journey to Budapest. In fact, Menger’s colloquium constituted
a sparkling constellation of big stars in the universe of mathematics.

The movement toward axiomatization in economics took place only in
Vienna. This was not the result of pure chance. We must not dismiss the
intellectual background in Vienna, which produced and promoted this movement.
It may be classified under the following headings: (a) the research activities
in economics, (b) Hilbert’s philosophy on the foundations of mathematics,
(c) the new trend in the philosophy of science, and (d) the development of
convex analysis. Without a combination of these factors, the movement toward
axiomatization in economics would not have emerged. It was the interplay
of these factors that led Wald to the existence proof. Here Karl
Menger played an active role as an organizer. He built the bridge between these
factors. He was an exceptional person who was well acquainted with mathematics,
philosophy of science and economics.¹

Menger became deeply interested in the social sciences, in economics in
particular. In 1923, he edited the second edition of his father’s Grundsätze der
Volkswirtschaftslehre and wrote an introduction to it.

In his papers on the law of returns, Menger [28] examined alleged proofs of
the law available in the economic literature. He himself characterized these
papers as a study in meta-economics.² He also participated in the Vienna
Circle and was fully aware of the new trend in the philosophy of science. Without
him, mathematical economics would not have flourished in Vienna.

In what follows, we examine each of these factors and clarify
how their interplay motivated Wald to achieve the existence
proof. In section 2, we give a brief sketch of the research activities in economics
between the Wars. We explore why the members of Menger’s colloquium
accepted the Walras-Cassel general equilibrium theory as material for
discussion in the seminar.

¹ For more on K. Menger’s contribution to the social sciences, see Golland, L. and K.
Sigmund [16].

² Menger ([28], p.280) said: “Following a suggestion of Hilbert, modern logicians
refer to the study of the logical relations between the statements of a theory as the
corresponding meta-theory. In this terminology, the contents of the present paper
can be described as a chapter in meta-economics”.

Wald axiomatized the Walras-Cassel system and gave an existence proof
for it. His primary concern with the existence problem reflected Hilbert’s
philosophy on the foundations of mathematics. Hilbert insisted that the consistency
of an axiomatic system was equivalent to the existence of the corresponding
mathematical concept.

Hilbert’s philosophy was mainly concerned with mathematics proper. On
the other hand, it was in economics that Wald achieved the existence proof. In
this respect, the Vienna Circle played a key role in connecting the philosophy
advocated by Hilbert with the sciences in general. In the Circle’s view, all
sciences must be pursued according to the same method. The Vienna Circle
considered the axiomatic method indispensable for this program and took a
formalistic approach to an axiomatic system. In considering the emergence of
Wald’s existence proof in economics, we have to distinguish the influence of
Hilbert’s philosophy from that of the Vienna Circle’s philosophy. The former
influenced Wald’s concern with the existence proof, while the latter
influenced his concern with the axiomatic treatment of economic theory. If
we grasp the content of Hilbert’s formalism only vaguely, we fail to make a
clear distinction between the influence of Hilbert’s philosophy and that of the
Vienna Circle’s philosophy of science.

We therefore define the content of Hilbert’s formalism in section 3 and
show how the Viennese mathematicians accepted it.

In section 4, we deal with the new trend in the philosophy of science in
Vienna and study its relation to axiomatic economic analysis. We also
examine how the view of unified science promoted by the Vienna Circle stimulated
Wald to axiomatize economic theory.

In section 5, we examine the mathematical structure of Wald’s proof. Wald
achieved his existence proof in an elementary way. One may therefore get the
impression that Wald’s proof did not contain mathematical content that would
attract the interest of the Viennese mathematicians.

In Vienna, the mathematical tools of convex analysis were intensively studied.
We therefore examine to what extent Wald’s proof was linked to the
development of convex analysis in Vienna at that time.

Wald’s model contains a system of linear inequalities. The theory of linear
inequalities is closely related to the Minkowski-Farkas lemma, which is
often referred to as the separation theorem of convex sets. Although Wald did
not mention the Minkowski-Farkas lemma, we can prove it based upon his
idea of proof. We may conclude that his idea of proof contained
mathematical content of essentially the same depth as the Minkowski-Farkas lemma.
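To make the lemma concrete, here is a minimal Python sketch. It assumes one common formulation of the Minkowski-Farkas lemma, which need not be the formulation used in the Appendix: for a matrix A and a vector b, exactly one of the systems "Ax = b with x ≥ 0" and "ᵗA y ≥ 0 with b·y < 0" has a solution. The function names and the numerical data are hypothetical and serve only as an illustration.

```python
def is_nonneg_solution(A, b, x):
    """True if x >= 0 componentwise and A x = b."""
    ax = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]
    return all(x_j >= 0 for x_j in x) and ax == b


def is_farkas_certificate(A, b, y):
    """True if tA y >= 0 componentwise and b . y < 0.

    Such a y certifies that A x = b has no solution with x >= 0.
    """
    n_cols = len(A[0])
    at_y = [sum(A[i][j] * y[i] for i in range(len(A))) for j in range(n_cols)]
    b_dot_y = sum(b_i * y_i for b_i, y_i in zip(b, y))
    return all(v >= 0 for v in at_y) and b_dot_y < 0


# Hypothetical data: A is the 2 x 2 identity matrix.
A = [[1, 0], [0, 1]]

# b_good lies in the cone generated by the columns of A, so a solution x >= 0 exists.
b_good = [2, 3]
print(is_nonneg_solution(A, b_good, [2, 3]))

# b_bad has a negative first component, so no x >= 0 solves A x = b_bad;
# the vector y = (1, 0) separates b_bad from the cone.
b_bad = [-1, 1]
print(is_farkas_certificate(A, b_bad, [1, 0]))
```

Geometrically, the certificate y is the normal of a hyperplane separating b from the convex cone spanned by the columns of A, which is why the lemma is associated with the separation of convex sets.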

In addition, based upon his idea of proof, we can prove the duality theorem
in linear programming and the minimax theorem. These mathematical
exercises may help us to evaluate the “depth” of Wald’s proof. The proofs are
collected in the Appendix.

Finally, the concluding remarks summarize the paper.



2. Viennese view on Cassel’s system

It was for the Walras-Cassel equation system that Wald gave an existence proof.
We first present Cassel’s formulation. In his “Grundriss einer elementaren
Preislehre” [10] and in his Theoretische Sozialökonomie [11], Cassel reformulated
Walras’ general equilibrium theory as the following equation system.

Suppose that there are n goods and m factors of production. The available
quantities of factors are assumed to be given exogenously. We denote the demand
for the j-th good by $s_j$ and the supply of the i-th factor by $r_i$, where
$j = 1, 2, \ldots, n$ and $i = 1, 2, \ldots, m$. Let $p_j$ represent the price of the j-th good
and $q_i$ the price of the i-th factor; p and q represent the column price
vectors with components $p_j$ and $q_i$ respectively.

Let $a_{ij}$ be the amount of the i-th factor required for the production of one
unit of the j-th good. We define A as the m by n matrix with elements $a_{ij}$. We
denote by r an m-dimensional column vector with components $r_i$ and by s
an n-dimensional column vector with components $s_j$. Cassel’s system can be
summarized as follows.

(1) the demand-supply conditions for factors:

$$As = r. \qquad (1)$$

(2) the price-cost conditions for goods³:

$${}^{t}\!A q = p. \qquad (2)$$

(3) the demand functions for goods:

$$s_j = f_j(p_1, \ldots, p_n) \quad (j = 1, 2, \ldots, n). \qquad (3)$$
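As a rough illustration of how conditions (1) to (3) fit together, the following Python sketch checks a candidate solution for an economy with two goods and two factors. The matrix A, the factor supplies r and, in particular, the constant-expenditure demand functions are hypothetical choices made only for this example; they are not taken from Cassel or Wald.

```python
from fractions import Fraction


def mat_vec(M, v):
    """Multiply the matrix M (a list of rows) by the vector v."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def transpose(M):
    """Return the transpose of M as a list of rows."""
    return [list(col) for col in zip(*M)]


def cassel_residuals(A, r, s, p, q, demand):
    """Residuals of (1) As = r, (2) tA q = p and (3) s = f(p).

    All three lists are zero exactly when (s, p, q) solves the system.
    """
    factor_gap = [x - y for x, y in zip(mat_vec(A, s), r)]
    cost_gap = [x - y for x, y in zip(mat_vec(transpose(A), q), p)]
    demand_gap = [x - y for x, y in zip(s, demand(p))]
    return factor_gap, cost_gap, demand_gap


def count_equations_and_unknowns(m, n):
    """(1) gives m equations, (2) and (3) give n each; unknowns are s, p (n each) and q (m)."""
    return m + n + n, n + n + m


# Hypothetical data: m = 2 factors, n = 2 goods.
A = [[1, 2], [3, 1]]
r = [5, 5]


def demand(p):
    """Hypothetical demand: a fixed expenditure b_j on each good, s_j = b_j / p_j."""
    budget = [Fraction(4), Fraction(6)]
    return [b_j / p_j for b_j, p_j in zip(budget, p)]


# Candidate quantities, factor prices and goods prices.
s = [Fraction(1), Fraction(2)]
q = [Fraction(1), Fraction(1)]
p = [Fraction(4), Fraction(3)]

print(cassel_residuals(A, r, s, p, q, demand))  # every residual is zero here
print(count_equations_and_unknowns(2, 2))
```

In this example all residuals vanish, so the candidate satisfies the three blocks of conditions simultaneously. The counting function only records that the system has 2n + m equations in 2n + m unknowns; equality of these two numbers by itself does not guarantee an economically meaningful solution.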

Karl Menger introduced this theory in his colloquium and frequently discussed
it with the members. During the interwar period, research activities
gradually shifted from the university to private seminars and research
institutes (see E. Craver [13]).

In the 1920s, Hans Mayer’s seminar focused on the Zurechnungsproblem (the
imputation problem), the problem of imputing the values of consumption goods
to the factors of production. Karl Menger, Schlesinger, Wald and Morgenstern
attended Mayer’s seminar.

In the early thirties, Ludwig von Mises’ seminar played an important role
in the Viennese economics community. Among other economists who attended



³ Here ${}^{t}\!A$ stands for the transposed matrix of A.




Mathematical economics in Vienna between the Wars 






Mises’ seminar, we must mention such names as Friedrich von Hayek,
Fritz Machlup and Gottfried Haberler. Karl Schlesinger was listed as one of
the regular members of the seminar (see L. von Mises [48], p.100). The range
of their discussions in the seminar extended beyond economics. In his
Notes and Recollections, Mises ([48], p.97) wrote: “we informally discussed
all important problems of economics, social philosophy, sociology, logic, and
the epistemology of the sciences of human action”.

It is plausible to infer that through the debate in Mayer’s seminar,
Menger, Schlesinger and eventually Wald became familiar with the
Walras-Cassel equation system.

Certainly, Cassel’s Theoretische Sozialökonomie was used as a textbook so
widely in Central European countries (see E. R. Weintraub [57], p.4) that we
can find several criticisms of this treatise.

In 1926, Morgenstern wrote a review article on Cassel’s book, but it was
never published. In addition, in his article in the Encyclopaedia of the Social
Sciences, Morgenstern ([34], p.367) argued that “Cassel took over Walras’
equations in a simplified form, but in his presentation there are more equations than
unknowns; that is, the conditions of equilibrium are overdetermined”.⁴

Ewald Schams [42], who kept a close relationship with Morgenstern, argued
that one of Cassel’s equations was not independent. And Schams [42]
condemned Cassel for his mathematical mistake.⁵



⁴ Regrettably, we have no idea what Morgenstern meant by this
passage. It seems to be a conundrum.

⁵ Although he made no reference to Wicksell’s 1900 paper [58], it seems that
Schams’ comment was based upon his reading of the paper. Wicksell was the
first to refute Cassel’s equation system as logically inconsistent. In addition,
Wicksell was the first to point out that the equality in the demand-supply
conditions for factors (1) should be replaced by an inequality. Wicksell [59] pointed
out that Cassel’s equation system had more unknowns than equations since the
system had to satisfy Walras’ law. In contrast to Cassel, Walras showed a correct
way to equate the number of unknowns to that of equations by introducing the
numeraire. In 1919, in his critical review of Cassel’s Theoretische Sozialökonomie,
Wicksell [59] noted that Cassel proceeded in an entirely different manner. Cassel
considered that money had the function of a store of value, but Wicksell insisted that
Cassel’s “premature introduction of money had contributed not to increased lucidity
but rather the reverse (p.225)”. In his 1919 paper, we read:

he (Cassel) describes how the total rewards of the factors of production are
in the main identical with total (real) incomes, and are at the same time the
source of the demand for goods and services; he adds that these incomes
are not all consumed, but are partly saved. But at this point the equality
between the sum of the factors of production now available, and that part
of them which enters into the various goods demanded for consumption,
ceases to obtain, and Professor Cassel’s equations (in this paper (1)) is no
longer valid. (pp. 225-226)

Thus we can safely say that Cassel’s system was known to the Viennese
economists by the mid-1920s.

In the thirties, some authors, such as Neisser, von Stackelberg and Zeuthen,
pointed out that equality between the number of equations and the number of unknowns
did not assure an economically meaningful solution. Neisser ([36], p.424) gave
a numerical example in which equilibrium prices are negative. Von Stackelberg
([51], pp.463-467) noted that if the number of factors exceeded that of goods,
there were more equations than unknowns. Zeuthen [61] argued that if the demand
for any factor fell short of its supply, then the factor could no longer be regarded
as a scarce good but should be regarded as a free good. He therefore insisted that
equation (1) should be modified to an inequality, and he introduced the
idea of complementary slackness conditions.
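The modification of condition (1) can be sketched numerically. The following Python fragment assumes the inequality form As ≤ r together with the rule that a factor in excess supply is a free good with price zero, so that $q_i (r_i - (As)_i) = 0$ for every factor i. The economy with three factors and two goods is invented for illustration and is not taken from Zeuthen, Schlesinger or von Stackelberg.

```python
def complementary_slackness_holds(A, r, s, q):
    """Check As <= r, q >= 0 and q_i * (r_i - (As)_i) = 0 for every factor i."""
    used = [sum(a_ij * s_j for a_ij, s_j in zip(row, s)) for row in A]
    for used_i, r_i, q_i in zip(used, r, q):
        slack = r_i - used_i  # unused quantity of factor i
        if slack < 0 or q_i < 0 or q_i * slack != 0:
            return False
    return True


# Hypothetical data: m = 3 factors, n = 2 goods, so A s = r would impose
# three equations on the two unknowns s_1 and s_2.
A = [[1, 1], [1, 0], [0, 1]]
r = [4, 3, 3]

# With equalities, s_1 = 3 and s_2 = 3 would be forced by the last two rows,
# which contradicts s_1 + s_2 = 4. With inequalities, s = (2, 2) is feasible:
# factor 1 is fully used, factors 2 and 3 are in excess supply.
s = [2, 2]

print(complementary_slackness_holds(A, r, s, [1, 0, 0]))  # free goods priced at zero
print(complementary_slackness_holds(A, r, s, [1, 1, 0]))  # a slack factor with a positive price
```

The first price vector satisfies the conditions, while the second violates them because factor 2 is not fully used and yet carries a positive price.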

Karl Schlesinger, independently of Zeuthen, made the same point.⁶ Karl
Menger invited him to present the paper in Menger’s colloquium.

In this way, the Walras-Cassel system was widely discussed in Vienna. Karl
Menger introduced this theory into his colloquium. Menger seemed to regard
the Walras-Cassel general equilibrium theory as suitable material for
discussion in his colloquium.

However, the foundation of general equilibrium theory was criticized by the
Central European economists. Mayer, von Mises, A. Amonn and E. Schams
seemed to be the representative figures. They criticized general equilibrium
theory on the grounds that it could not explain the principles of price formation in real
markets.

Mises ([48], p.36) claims that “the Austrian School endeavors to explain 
prices that are really paid in the market, and not just prices that would be 
paid under certain, never realizable conditions”. The Austrian school rejects 
a mere description of a state of hypothetical static equilibrium. They insist 
that the economists should describe and explain the entire genetic path of the 
economic process in its various intermediate stages leading to the attainment of 
the equilibrium. The origin of this Austrian attitude goes back to Carl Menger’s 
methodology of economics. 

Amonn [3], Schams [42] and Mayer [27] criticized Cassel’s economics as
a whole on the same grounds. Cassel began directly with a demand function.
They objected that Cassel’s theory lacked any explanation of price
formation ([3] p.41, [42] pp.389-390, [27] pp.234-235).

Karl Menger, the son of Carl Menger, was fully aware of these criticisms. 
Nevertheless, the members around Menger adopted the Cassel equation system 



⁶ Schlesinger read the papers by Neisser and von Stackelberg. Thanks to their
papers, Schlesinger obtained the idea of modifying equation (1) to an inequality.
A footnote indicated that Schlesinger became aware of Zeuthen’s paper while his
paper was in proof. See K. Schlesinger [44].

as material for discussion in the colloquium. We explore the reason why
they accepted the equilibrium theory.

Karl Menger ([31], p.54) explains the methodological difference between
the Austrian economists and the mathematical economists: the former look for
causal explanations⁷ of economic phenomena, whereas the latter wish to confine
themselves to the study of functional relations.
Moreover, the Austrians are looking for the ultimate cause of economic
phenomena.

By contrast, the mathematical economists reject causal explanations
of economic phenomena, since such a research program easily leads to
metaphysical or pseudo-problems. Menger ([31], pp.54-55) goes
on to say: “the Austrians are looking for the essence (das Wesen) of economic
phenomena, thereby moving on dangerous ground surrounded by swamps of
pseudo-problems”. The members of the colloquium consciously avoided the
danger of stepping into the area of metaphysics or pseudo-problems. For
example, in his review of Otto Kühne’s Exakte Nationalökonomie, Franz Alt
criticizes Kühne for adopting causal explanations of economic phenomena. Alt
[2] claims that causal explanations will be a source of metaphysical arguments
beyond our observations. Alt insists that economics should be restricted
to the study of economic phenomena that can be expressed in
observational terms.

According to Karl Menger ([31], p.47), one of the reasons why he adopted
Cassel’s system as material for discussion in his colloquium was that all
objects in the system were clearly measurable or observable.⁸ The members of
the colloquium rejected the causal explanation of economic phenomena. They
intended to exclude metaphysical problems from economic theory.

In this respect, the Walras-Cassel system seemed to be suitable material.
It was free from such metaphysical problems since it dealt with functional
relations among economic quantities.



⁷ Here we make clear the meaning of causal explanation. What do we mean when we
say, “A is the cause of B”? In the opinion of philosophers of science, causal relation
means predictability. This does not mean actual predictability, but rather a potential
predictability. It means predictability in the sense that, if all the relevant facts that
surrounded A were known, together with all the relevant laws, the occurrence of
B could then be predicted. In other words, we say that an event B is caused by a
preceding event A if and only if B is deducible from A with the aid of the relevant
laws. See R. Carnap [9], chap. 19.

⁸ Cassel claims that the concept “utility” should be excluded from economic theory.
He considers “utility” to be a nonobservable term in the sense that it can
never be measured by any simple and direct procedure. We do not have a clear idea
whether the members of the colloquium admitted or rejected the concept of utility.
In 1936, Alt published a paper on the measurability of utility [1]. Presumably,
there were different attitudes among the members of the colloquium toward Cassel’s
suggestion.

In this way, the members of the colloquium adopted the Walras-Cassel
system.

They were exclusively concerned with the existence problem of the Walras- 
Cassel system. In this respect, they were strongly influenced by the philosophy 
advocated by Hilbert. We discuss this point in detail in the subsequent section. 



3. Hilbert’s philosophy on foundations of mathematics 

Wald’s primary concern was the existence proof of the Walras-Cassel system. 
In our view, Wald was strongly influenced by the philosophy advocated by 
Hilbert. We first define the content of Hilbert’s formalism. 

Hilbert required the axioms to satisfy certain logical requirements, i.e.,
completeness, independence and consistency. Among them, checking the
last requirement is the most difficult task. A system of axioms is said to be consistent
when it is not contradictory, i.e., when it is not the case that both a formula and
its negation are provable in the system.

Moreover, Hilbert insists that the mathematical existence of a concept is
established if one can prove that the attributes assigned to the concept never
lead to a contradiction. Hilbert ([19], p.300) says:

If contradictory attributes be assigned to a concept, I say that
mathematically the concept does not exist.

According to Hilbert, the proof of consistency is equivalent to the
existence of the corresponding mathematical concept.⁹ We regard this claim
as the most essential feature of Hilbert’s formalism.

Hilbert’s formalism contrasts sharply with the intuitionism developed by L.
E. J. Brouwer. Menger intended to study with Brouwer, who was regarded as
one of the leading figures in the field of topology at that time. In 1925, Menger
went to Amsterdam, where he spent two years working with Brouwer. But, in
the twenties, Brouwer’s major interest had shifted to the foundations of
mathematics. After his return to Vienna, Menger introduced Brouwer’s intuitionism
into his colloquium and frequently discussed it with the members.

Brouwer’s view on the foundations of mathematics consists of two claims. 



⁹ Historically speaking, we can find an axiomatic method in Euclid’s Elements in the
third century B.C. Hilbert’s view is much more penetrating and solid than Euclid’s.
In this respect, Bourbaki ([7], p.39) said:

whereas in traditional logic the non-contradiction of a concept only made
it “possible”, it was equivalent for Hilbert (at least for mathematical
concepts defined axiomatically) to the existence of the concept.

(1) Brouwer believes that mathematics is founded on intuition¹⁰.

(2) Brouwer rejects the law of excluded middle applied at least to infinite sets¹¹.
He accepts only proof that is constructive. Brouwer considers that a mathematical
object exists only when we can actually construct the object in a finite
number of steps.

If Brouwer’s view were to be adopted, a great part of mathematics would
have to be given up¹². In his Reminiscences, Menger ([33], pp.138-139) said
that “his attacks on the law of the excluded middle and the consequences for
mathematics of its rejection had been discussed in the Circle on several earlier
occasions”, but Brouwer’s “obscure remarks on primordial intellectual
phenomena and primordial mathematical intuition were not taken seriously by any
member of the Circle”.

Hilbert’s philosophy had a great influence on the world of mathematicians. Since
it appeared, mathematicians have recognized the significance of the existence proof.
They considered that the consistency of a mathematical theory with which they
were concerned should be proved prior to any mathematical reasoning.

In the field of the foundations of mathematics, Hilbert intended to prove
that all of mathematics contained no contradiction.

In this connection, a young Viennese mathematician, Kurt Gödel, made
a completely unexpected and most significant discovery. Roughly speaking,
Gödel’s incompleteness theorem states that for any consistent formal theory
that includes arithmetic there exists
a proposition which can neither be proved nor disproved within the theory in
question. Gödel’s result shattered Hilbert’s hope of founding mathematics on
a proof of its consistency¹³.

The members of the colloquium were quite interested in the foundations of
mathematics and encouraged research on the subject. Hans Hahn was the
advisor of Gödel’s dissertation on the completeness theorem. It was at the



¹⁰ According to Hahn, “its point of departure seems too much akin to Kant’s pure
intuitionism and Kant’s a priori”. H. Hahn [17], p.26.

¹¹ Scarf ([41], p.12) insisted that “Brouwer’s original demonstration in 1910 was not
concerned with effective computational procedures, and Brouwer himself eventually
rejected the theorem because of its “non constructive” aspects”. Scarf’s claim
seems to be based upon the fact that Brouwer published the paper [8] concerning
the fixed point theorem, which was modified so as to meet the intuitionistic
standards.

¹² For example, Brouwer does not accept the general concept of irrational number
since we cannot construct its object in a finite number of steps. On the contrary,
Hilbert insists that in order to prove the existence of an irrational number, we do not
actually have to construct it and we do not even have to show how it can be
constructed. All we have to do is to prove that it must exist because any other conclusion
results in a contradiction.

¹³ It is well-known that von Neumann pursued his interest in problems concerning the
foundations of mathematics. For instance, his paper [49] is devoted to the problem
of consistency of mathematics.




I. Mutoh



mathematical colloquium that Gödel presented his result on the incompleteness
theorem [15]. At the time Menger was on a lecture tour in the United States.
Menger ([33], p.171) said: “I learned of his epoch making logical discovery
from a letter of Nöbeling while I was lecturing in Houston”. And he ([33],
pp.202-203) wrote: “In my excitement about this news I interrupted my course
with a report about Gödel’s epoch-making discovery”.

Although he published no paper on the foundations of mathematics, Wald
seemed to be decisively influenced by Hilbert’s philosophy.

We now raise the question: how did Hilbert’s thought exert an influence
on Wald’s research? In order to answer this question, we restrict
ourselves here to the field of geometry, with which Wald’s early research was
connected.

In his Grundlagen der Geometrie (The Foundations of Geometry), published
in 1899, Hilbert established a complete set of independent axioms by means of
which it would be possible to prove all theorems of Euclidean geometry. And
by applying his axiomatic method, Hilbert showed that any contradiction in
Euclidean geometry must appear as a contradiction in arithmetic.

Wald’s early research on geometry was concerned with the axiomatic system
of Hilbert’s Grundlagen. Wald [52] improved Hilbert’s system by omitting
some axioms and weakening others.

The Ergebnisse contains no paper on the axiomatic system of Hilbert’s
Grundlagen except for Wald’s paper. It should be noted that it was not in
Vienna but in Göttingen that Hilbert’s Grundlagen was intensively studied.
Hilbert became a professor at the University of Göttingen in 1895. At that time
Göttingen was one of the centers of mathematical research activities. Among
the many young mathematicians attracted to Göttingen from all over the world by
Hilbert’s Grundlagen, we can count such big names as F. Schur and A. Rosenthal.
They studied Hilbert’s book and some of their results were devoted to
the improvement of its axiomatic system. A great part of these papers on the
foundations of geometry appeared in the Mathematische Annalen. Hilbert was
one of the principal editors of the journal. We recall here that the movement
toward axiomatization in economics never occurred in Göttingen, Hilbert’s
home ground.

On the other hand, the Viennese mathematicians and philosophers resonated
with the philosophical aspect of Hilbert’s Grundlagen. Indeed, the Vienna
Circle paid special attention to Hilbert’s Grundlagen in order to support
their philosophical views. And this fact seemed to play an important role in the
emergence of the highly axiomatic mathematization of economics in Vienna. We
examine this point in detail.

4. The new trend in the philosophy of science

The Vienna Circle was formed around the physicist and philosopher Moritz
Schlick. All members had a first-hand acquaintance with some field of science,
either mathematics, physics or social science. They were also familiar
with logic and keenly interested in foundational questions. The central figures
among them were Rudolf Carnap, Otto Neurath, Hahn and many others. They
had met weekly since 1924. It was in the fall of 1927 that Karl Menger began to
take part in the Circle on a regular basis.

Their essential views might be classified under the following aspects.¹⁴
(a) They sharply reject the existence of synthetic a priori propositions in
the sense of Immanuel Kant¹⁵.



¹⁴ Their views on the role of philosophy were consonant with Wittgenstein’s
views, which were manifested by the closing statements in his Tractatus Logico-Philosophicus.
The members of the Vienna Circle studied the Tractatus at Schlick’s
and Hahn’s request. In the Tractatus, we read:

The right method of philosophy would be this. To say nothing except
what can be said, i.e. the propositions of natural science, i.e. something
that has nothing to do with philosophy: and then always, when someone
else wished to say something metaphysical, to demonstrate to him that
he had given no meaning to certain signs in his propositions. This method
would be unsatisfying to the other (he would not have the feeling that we
were teaching him philosophy) but it would be the only strictly correct
method. Whereof one cannot speak, thereof one must be silent.
([60], pp.187-189).

In Karl Menger’s opinion ([33], p.84), “this last sentence of the Tractatus was of
great importance for philosophy”. But the Vienna Circle’s philosophy did not
directly originate in Wittgenstein’s Tractatus. Karl Menger ([33], p.90) wrote:

“Schlick emphasized this role of philosophy in his lectures even before he
had seen the Tractatus. This concept was probably also dormant in the
minds of some of the other members of the Circle”.

For detailed accounts of the relation between Wittgenstein and the members of the
Vienna Circle, see Menger [33].

¹⁵ Kant ([22], p.130) argued that in all subject-predicate judgements the relation of a
subject A to the predicate B could be considered in two different ways. Either
the predicate B belongs to the subject A as something that is contained in
the concept A, or B lies entirely outside the concept A. In the first case Kant calls
the judgement analytic, in the second case he calls the judgement synthetic. In this
sense, an analytic statement involves nothing more than the meaning relations of
the terms, while a synthetic statement is an assertion that goes beyond the assigned
meanings of the terms. A priori statements are completely independent of experience.
It is never necessary to refer to experience as a justification for the truth of an
analytic statement. A posteriori statements are essentially dependent on experience
in the sense that they have to be justified by experience. It is obvious that all analytic
statements are a priori. Now an important question arises: does the demarcation
line between a priori and a posteriori coincide with the one between analytic and

(b) They consider that the task of philosophy is to investigate the syntactical
structure of propositions in sciences.

(c) The peculiar view they put forward is that all sciences must be pursued 
according to the same method. In this sense, there can exist only one science 
for them. They were actually fighting for unified science. 

They stressed the need for rigorous transformation of propositions through 
logico-deductive inference and critically examined the statements in sciences 
from a formal point of view. In order to approach this target, the axiomatic 
method seems to be indispensable. In an axiomatic method, all the assumptions 
required for the given theory have to be enumerated explicitly and completely. 
In addition, the conclusions have to be drawn by the logico-mathematical rea- 
soning. 

Furthermore, according to the program for unified science, there is no fun- 
damental difference in axiomatic method in all sciences. 

The Vienna Circle takes a formalistic approach, which is ascribed to
Hilbert, to an axiomatic system. Carnap says: “Today, we often take a purely
formalistic approach to an axiom system. We do not ask what interpretations
or applications it may have, but only whether the system of axioms is logically
consistent and whether a certain statement is derivable from it”.

The Viennese mathematicians and economists were in favor of the Vienna
Circle’s philosophy of science. In accepting it, each of them put special
emphasis on one of its aspects according to his own interests.

Morgenstern ([35], p.396) stressed the need to treat any theory in a rigorous
way and was in favor of the axiomatic method. He said that “in order to
gain a rigorous insight into the state of any science, the use of the axiomatic
method cannot be dispensed with”. He went on to say that “there is no
fundamental difference in the axiomatic procedure whether it be the formulation
of empirical or aprioristic sciences” ([35], p.396). As for economics, “this science
has to be seen in its empirical character and has to be built as such, but that it
has to be developed as rigorous theory” ([35], p.396). It is obvious that
Morgenstern was strongly influenced by the view of unified science promoted by
the Vienna Circle.

Morgenstern understood the axiomatic method as a rigorous logico-deductive
system. Although he ([35], pp.409-410) appreciated Wald’s existence proofs,
Morgenstern was rather concerned with the application of economic theory to
the real world and the empirical justification for economic theory.

synthetic? In other words, is it possible for knowledge to be both synthetic and a
priori? Kant answered this question affirmatively. Kant ([22], chap.1) regarded
all geometrical statements as synthetic a priori propositions. In the Circle’s view,
Kant was to be criticized for confusing pure geometry with physical geometry.
The former is simply a deductive system based on a set of axioms. It
is completely independent of the world. The latter, on the other hand, is concerned
with the application of pure geometry to the world. This distinction was made
especially clear through Hilbert’s Grundlagen.

R. Carnap [9], p.130.

On the other hand, as a mathematician, Menger thought he should be
concerned with the rigorous analysis of theories. His methodological views are
stated in his paper [28] on the law of returns. Although he admits “the
crucial issue for economics to be whether or not these laws are empirically
confirmable” (p.280), Menger regards this empirical question as a secondary
issue.

Both Wald and Alt shared Menger’s methodological views (see
A. Wald [55], p.369; see also F. Alt [2]).

Wald was not a member of the Vienna Circle. However, he seems to have
become acquainted with the Vienna Circle’s views through reading
Menger’s papers or through personal communication with Menger.

Wald axiomatized economic theory. The Vienna Circle’s philosophical
view had an influence on Wald’s axiomatic treatment of economic theory.
For him, axiomatizing the theory was a fundamental prerequisite for the
existence proof. Hilbert’s formalism had an influence on Wald’s concern for the
existence proof. As a mathematician, Wald studied Hilbert’s Grundlagen and
its axiomatic method out of purely mathematical interest. Consequently, the
emergence of an axiomatization in economics due to Wald is explained by
his acceptance of Hilbert’s formalism in connection with the Vienna Circle’s
philosophical views.



5. The development of convex analysis 

We recall here that it was at the mathematical colloquium that Wald presented
his proof, and that its main participants were mathematicians. If Wald’s proof had
not contained mathematical content of interest to the Viennese mathematicians,
Wald would never have been induced to present it in the colloquium.

In Vienna, the mathematical tools of convex analysis were intensively studied.
For example, Eduard Helly at the University of Vienna established a theorem
which ensures that a family of convex sets has a nonempty intersection.
It greatly contributed to the theory of linear inequalities. Hahn established what is
now called the Hahn-Banach theorem. From a geometric viewpoint, it asserts
that, under suitable conditions, there exists a closed hyperplane that separates
two disjoint convex sets in an infinite dimensional space. Moreover, the fixed
point theorem was intensively studied by Borsuk, Knaster and Mazurkiewicz.
They were frequently invited to the colloquium.

Karl Menger had a skeptical view as to the possibility of unified science. He ([33],
p.176) said, “I feared that the idea of a unified science might possibly lead to the
exclusion a priori of potentially valuable objects or methods of study.”

In the case of von Neumann’s paper, we can easily recognize that his
proof was closely linked with the development of convex analysis. In fact, he
formulated a balanced growth model in terms of inequalities, and his existence
proof was based upon a fixed point theorem.

By contrast, Wald’s proof, being based upon mathematical induction, gives
at first glance the impression that it had nothing to do with the development
of convex analysis at that time. However, it should be noted that it
was Karl Menger who encouraged him to present his proof at the mathematical
colloquium. It may be conjectured that Wald’s idea of proof was also linked with
the development of convex analysis.

We therefore ask the question: to what extent did Wald’s existence proof rest
on the properties of convexity? In order to answer this question, we
examine Wald’s proof and evaluate its mathematical structure.

As we have already mentioned above, it was Karl Schlesinger who started 
the discussion. The model which Schlesinger reformulated may be summarized 
as follows. 

(4) the demand-supply conditions for factors:

$$As \le r. \tag{4}$$

(5) the price-cost conditions for goods:

$${}^{t}Aq = p. \tag{5}$$

(6) the demand functions:

$$s_j = f_j(p_j) \quad (j = 1, 2, \dots, n). \tag{6}$$

(7) the complementary slackness conditions:

$${}^{t}q\,(r - As) = 0. \tag{7}$$

Schlesinger also raised the conjecture that this procedure would solve the 
existence problem. 

Wald inverted the demand function (6) to the form

$$p_j = f_j(s_j) \quad (j = 1, 2, \dots, n). \tag{8}$$

This paper was presented for the first time in 1932 at the mathematical seminar of
Princeton University. Karl Menger invited von Neumann to present the paper at the
mathematical colloquium.

Wald proved the theorem that the system of equations (4), (5), (7) and (8) has
a solution under the following conditions (see Appendix (I)).

(A1) $A \ge 0$, and for each $j$ there is at least one $i$ for which $a_{ij} \neq 0$.

(A2) $r > 0$.

(A3) For each $j$, the function $f_j(s_j)$ is defined for every positive value of $s_j$,
its value is nonnegative, continuous, and strictly monotone decreasing, and in
addition $\lim_{s_j \to 0} f_j(s_j) = +\infty$.

The idea of Wald’s proof was based on mathematical induction on the
number of goods.

It can be readily seen that the system of equations (4), (5), (7) and (8) has a
solution for the case $n = 1$. It now suffices to prove the theorem for the case $n$
on the inductive hypothesis that it holds for the case $n - 1$.

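We define $\bar{\lambda} = \max\{\lambda > 0 \mid r_i - a_{in}\lambda \ge 0,\ i = 1, 2, \dots, m\}$. For each
$\lambda \in (0, \bar{\lambda})$, consider the system, which we denote by $(W^{\lambda}_{n-1})$,

$$
(W^{\lambda}_{n-1}) \quad
\begin{cases}
a_{i1}s_1 + \cdots + a_{i,n-1}s_{n-1} \le r_i - a_{in}\lambda = r_i' & (i = 1, 2, \dots, m), \\
p_j = a_{1j}q_1 + \cdots + a_{mj}q_m & (j = 1, 2, \dots, n-1), \\
q_i\,(a_{i1}s_1 + \cdots + a_{i,n-1}s_{n-1} - r_i') = 0.
\end{cases}
$$

For $\lambda \in (0, \bar{\lambda})$, the system $(W^{\lambda}_{n-1})$ satisfies assumptions (A1) to (A3).
Then the system $(W^{\lambda}_{n-1})$ has solutions $s_j^* > 0$, $p_j^* \ge 0$ ($j = 1, \dots, n-1$)
and $q_i^* \ge 0$ ($i = 1, \dots, m$) by the inductive hypothesis. We choose
$s_n = \lambda \in (0, \bar{\lambda})$. The solutions for the case $n - 1$ and $s_n = \lambda$ satisfy all the
equations except for those that contain the unknown $p_n$, that is to say,

$$p_n = f_n(\lambda) \quad \text{and} \quad p_n = a_{1n}q_1 + \cdots + a_{mn}q_m.$$

Then we construct the set

$$T(\lambda) = \{\, a_{1n}q_1 + \cdots + a_{mn}q_m - f_n(\lambda) \mid (q_1, \dots, q_m) \text{ are solutions of } (W^{\lambda}_{n-1}) \,\}.$$

The essential argument of Wald’s proof is to show that there exists
$\lambda_0 \in (0, \bar{\lambda})$ such that $0 \in T(\lambda_0)$.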

To obtain this result, Wald applies the intermediate value theorem to the
correspondence $T : (0, \bar{\lambda}) \to \mathbb{R}$. In order to appeal to this theorem, we have to
check a couple of points; that is to say,

(i) $T$ is a convex-valued and bounded-valued correspondence that
has a closed graph on any closed subinterval $I$ of $(0, \bar{\lambda})$, and

(ii) there are $\lambda'$ and $\lambda''$ in $I$ such that $T(\lambda')$ contains a non-positive
number and $T(\lambda'')$ contains a nonnegative number.

The remaining part of Wald’s proof ([53], pp.15-16) is devoted to showing that
all the conditions of the intermediate value theorem are fulfilled. Thus, by
virtue of this theorem, there exists $\lambda_0 \in (0, \bar{\lambda})$ such that $0 \in T(\lambda_0)$. Then
$s_n = \lambda_0$ together with the solution of $(W^{\lambda_0}_{n-1})$ constitute the solution of the
equation system (4), (5), (7) and (8).

We are much indebted to Hildenbrand [20] for an important insight into the
mathematical ideas embodied in Wald’s proof.

We make here a few remarks on the characteristics of Wald’s original
method of proof.

(I) Wald assumed that the supply of each factor is exogenously given and does not
depend on the factor price. This assumption enabled Wald to solve the existence
problem in an elementary way. The first step is to solve the system of inequalities
(4), which can be reduced to the problem that Julius Farkas proposed. Assume
now that the desired solution $s^*$ actually exists. Then one can obtain from (8)
the solution $p^*$ corresponding to this $s^*$. Finally, we determine the value $q^*$,
which must be chosen so as to satisfy both conditions (5) and (7).

The model developed by Schlesinger and Wald is based on the idea that
the system of equations should be modified to a system of inequalities. Schlesinger
introduced nonnegative slack variables on the demand side of the factor market
equations. The condition (7) implies that if the condition (4) holds with strict
inequality for some factor $i$, then the corresponding price $q_i$ must be zero.
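
The implication stated for condition (7) can be spelled out term by term. The following short derivation is a sketch that assumes, as in Wald's theorem, that the factor prices are nonnegative ($q \ge 0$) and that (4) holds:

```latex
{}^{t}q\,(r - As) = \sum_{i=1}^{m} q_i \Bigl( r_i - \sum_{j=1}^{n} a_{ij} s_j \Bigr) = 0,
\qquad q_i \ge 0, \quad r_i - \sum_{j=1}^{n} a_{ij} s_j \ge 0 \quad (i = 1, \dots, m).
% Every summand is a product of two nonnegative numbers and the sum is zero,
% so each summand vanishes separately:
q_i \Bigl( r_i - \sum_{j=1}^{n} a_{ij} s_j \Bigr) = 0 \quad (i = 1, \dots, m),
\qquad \text{hence} \qquad
r_i - \sum_{j=1}^{n} a_{ij} s_j > 0 \;\Longrightarrow\; q_i = 0 .
```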

(II) As mentioned above, Wald’s idea of proof was based on mathematical
induction on the number of goods. His proof was not based on the fixed point
theorem. Indeed, the existence proof for the modified Walras-Cassel system
could be achieved in a much more elementary way, without the fixed point
theorem.

If we assume that the supply of each factor does depend on the factor prices,
a fixed point theorem is needed for proving existence. Indeed, let us
assume that the supply $r_i$ of factor $i$ is a function of its factor price $q_i$. We start
with a vector $q' = (q_1', q_2', \dots, q_m')$, which is chosen arbitrarily. We solve the
system of inequalities (4) for the vector $q'$ and obtain the solution $s'$. From
(8) the vector $p' = (p_1', \dots, p_n')$ is obtained for the solution $s'$. Finally, we
determine the value $q''$, which must be chosen so as to satisfy both conditions
(5) and (7). However, there is no guarantee that the vector $q''$ coincides with
the initial vector $q'$.

Arrow [4] relates an episode about Wald in connection with the use of the fixed
point theorem. In a paper which was announced to appear in the
Ergebnisse in 1938, Wald formulated a general equilibrium theory of a pure
exchange economy on the basis of Paretian analysis. Wald ([55], pp.379-380)
referred repeatedly to the need for “method of modern mathematics” for the
proof. Unfortunately, Wald’s paper was lost. In view of Uzawa’s result [47],
the existence of equilibrium is mathematically equivalent to the fixed
point theorem. Then, as Arrow ([4], p.26) pointed out, “it seems impossible
to conceive of any proof not based on a fixed point theorem”. Thus Arrow
([4], p.26) conjectures that “the references to ‘modern methods’ may imply
Wald’s use of fixed point arguments”. Arrow ([4], p.27) goes on to say: “But
we will never know, unless, against all probability, Wald’s original manuscript
emerges”.

It is often said that Wald’s proof is “clumsy” (for example, see Wolfowitz
[56], p.3). Thus, for example, Kuhn [24] reorganized Wald’s proof by using
the duality theorem in linear programming and the Kakutani fixed point
theorem. In this paper, we do not take such an approach. Rather, we
examine to what extent mathematical theorems can be proved based upon
Wald’s idea of proof. For example, we can prove the Minkowski-Farkas lemma
with the additional conditions (A1) and (A2) (see Appendix (II)).

Moreover, as is well known, the duality theorem in linear programming can
be proved by using the Minkowski-Farkas lemma. We assume the conditions
(A1) and (A2’) $r \ge 0$ and $p \ge 0$. We replace the equality of the price-cost
conditions for goods (5) by the inequality

$${}^{t}Aq \ge p \tag{9}$$

and introduce the complementary slackness conditions

$${}^{t}s\,({}^{t}Aq - p) = 0. \tag{10}$$

It can be verified that Wald’s idea is applicable to the proof of the duality
theorem in linear programming with these additional conditions (see Appendix
(III)). In addition, we consider the two-person zero-sum matrix game and the
minimax theorem. Wald’s idea is applicable to the proof of the minimax
theorem (see Appendix (IV)).

Thanks to these mathematical exercises, we can evaluate the “depth” of
Wald’s proof. The result of these exercises shows that Wald’s proof contained
mathematical content that attracted the interest of the Viennese mathematicians.
Menger thus seems to have considered Wald’s paper well qualified as
material for discussion in a seminar of mathematicians.



6. Concluding remarks 

As we have mentioned above, Wald’s existence proof should be regarded as
a starting point of the axiomatization in economics. In this paper, we have
investigated the intellectual milieu which stimulated Wald to the existence proof:
(a) the research activities in economics, (b) Hilbert’s philosophy on the
foundations of mathematics, (c) the new trend in the philosophy of science, and (d)
the development of convex analysis. It was the interplay of these factors that
stimulated Wald to the existence proof. Karl Menger played an important role
as an organizer. He built the bridge between these factors.

We are now in a position to give a brief summary of this paper in compari- 
son with the preceding literature on this topic. 

In section 2, we explored the reason why the members of Menger’s colloquium
accepted the Walras-Cassel general equilibrium theory as material for
discussion in the seminar. Weintraub [57] did not address the question which we
posed in this paper. According to Weintraub, Cassel’s book was widely used
as a text in Central European countries. However, we must pay attention to the
fact that general equilibrium theory was severely attacked by Austrian
economists such as von Mises and Mayer. We investigated the methodological
views shared by almost all members of Menger’s colloquium. They rejected
the causal explanation of economic phenomena and looked to the study of
functional relations. In addition, Cassel’s equation system was suitable as
material for discussion since all objects in the system were observable.

Wald axiomatized the Walras-Cassel system and solved the existence prob- 
lem. 

In our view, Wald’s concern for the existence proof reflected Hilbert’s
philosophy on the foundations of mathematics, and his concern for the axiomatic
treatment of economics reflected the Vienna Circle’s philosophy. We have to
distinguish the influence of Hilbert’s philosophy from that of the Vienna
Circle’s philosophy.

Then, in section 3, we defined the content of Hilbert’s formalism as follows:
the consistency of the axiomatic system is equivalent to the existence of
the mathematical concept.

In contrast, Ingrao & Israel ([21]) and Punzo ([37], [38], [39]) seemed to 
regard the essential content of Hilbert’s formalism as purely logico-deductive 
systems (see L. Punzo. [39] p.23). 

For example, Ingrao & Israel ([21], pp.182-183) said that Hilbert’s “Grundlagen
der Geometrie represented the programmatic manifesto of the axiomatic
movement and explicitly brings out the main points of this tendency, according
to which a mathematical theory is nothing more than a complex of theorems
obtained through deductive logic and defining the properties of a mathematical
entity defined by axioms”.

Ingrao & Israel ([21], pp.182-183) seemed to hold that the influence of
Hilbert’s philosophy was reflected only in Wald’s attitude toward axiomatizing
economics. This is due to their ambiguous grasp of the content of Hilbert’s
formalism.

Hilbert’s philosophy was mainly concerned with mathematics proper. On
the other hand, it was in economics that Wald achieved the existence proof.
On this point, Ingrao & Israel ([21], pp.187-188) pointed out that Wald was
strongly influenced by the ideas of the Vienna Circle. They ([21], p.188) said
that Vienna was “the center of renewed interest in the formalization of the
social sciences”. Ingrao & Israel ([21], p.189) regarded the fundamental theme
of the Vienna Circle’s philosophy as the “unification of the sciences”.

Following the Vienna Circle’s view, all sciences must be pursued according 
to the same method. In order to approach this target, the axiomatic method 
seems to be indispensable. 

In our view, the emergence of the existence proof in economics due to
Wald can be explained by his acceptance of Hilbert’s formalism in connection
with the Vienna Circle’s philosophy of science.

Finally, we have evaluated the mathematical depth of Wald’s existence 
proof. This aspect has never been considered in the preceding investigations 
in the recent literature. We have shown that the important theorems in the field 
of convex analysis such as the Minkowski-Farkas lemma, the duality theorem 
in linear programming and the minimax theorem can be proved based upon 
Wald’s idea of proof. 

Mathematical economics flourished in Vienna between the Wars and
seemed to promise a brilliant future. However, this could not last for long.
Hitler’s annexation of Austria ended that possibility.

Politically, Austria was in a state of increasing tension and chaos. The Nazi
menace became urgent. Menger ([30], p.19) said:

The Ergebnisse was criticized (with specific reference to Wald) for 
its large number of Jewish contributions just when I felt that we 
ought to honor that journal by making Wald co-editor. Issue 7 was 
edited by Godel, Wald and myself. But Issue 8 containing Wald’s 
paper on collectives was destined to be the last of the series. • • • 
Viennese culture resembled a bed of delicate flowers to which its 
owner refused soil and light while a fiendish neighbor was waiting 
for a chance to ruin the entire garden. 

The activities of the Vienna Circle and Menger’s colloquium came to an
end. The members were dispersed.

Menger left for the United States and was offered a professorship at the
University of Notre Dame in Indiana. Carnap also emigrated to the United
States in 1935, where he taught at the University of Chicago. In 1936, Schlick
was assassinated by a student on the stairway of the University of Vienna.
Karl Schlesinger committed suicide on the very day of Hitler’s invasion of
Austria.

Wald was reluctant to leave Vienna. But Morgenstern persuaded Wald to
leave Austria and got him to the United States.




186 I. Mutoh 



7. Appendix 



In this Appendix, we evaluate the “depth” of Wald’s proof. As we have already said 
in section 5, we can prove the Minkowski-Farkas lemma, the duality theorem in linear 
programming and the minimax theorem based upon Wald’s idea of proof. 

First, we briefly restate Wald’s original method of proof for the convenience of
our arguments. Next, we show that Wald’s idea of proof is applicable to the proof
of the Minkowski-Farkas lemma, the duality theorem in linear programming and the
minimax theorem. These mathematical exercises may help us to evaluate the “depth” of
Wald’s idea.

(I) Wald’s proof 

Let $A$ be the $m$ by $n$ matrix with elements $a_{ij}$. We denote by $r$ an
$m$-dimensional column vector with components $r_i$ and by $s$ an $n$-dimensional column
vector with components $s_j$. $p$ and $q$ represent the column vectors with components
$p_j$ ($j = 1, 2, \dots, n$) and $q_i$ ($i = 1, 2, \dots, m$) respectively.

Wald considers the following system, which we denote by (W).

$$
(W) \quad
\begin{cases}
As \le r, \\
{}^{t}Aq = p, \\
p_j = f_j(s_j) \quad (j = 1, 2, \dots, n), \\
{}^{t}q\,(As - r) = 0.
\end{cases}
$$



Wald makes the following assumptions. 

(A1) $A \ge 0$, and for each $j$ there is at least one $i$ for which $a_{ij} \neq 0$.

(A2) $r > 0$.

(A3) For each $j$, the function $f_j(s_j)$ is defined for every positive value of $s_j$, its
value is nonnegative, continuous, and strictly monotone decreasing, and in addition
$\lim_{s_j \to 0} f_j(s_j) = +\infty$.

Wald’s theorem If assumptions (A1) to (A3) are satisfied, then there exist solutions
$s_j^* > 0$, $p_j^* \ge 0$ ($j = 1, 2, \dots, n$), $q_i^* \ge 0$ ($i = 1, 2, \dots, m$) for the system (W).



We state three lemmas before embarking on the proof of Wald’s theorem.

Lemma 1 Let $s_j^*$, $p_j^*$ ($j = 1, 2, \dots, n$), $q_i^*$ ($i = 1, 2, \dots, m$) be solutions of (W) and let
$s_j^* + \Delta s_j$, $p_j^* + \Delta p_j$ ($j = 1, 2, \dots, n$), $q_i^* + \Delta q_i$ ($i = 1, 2, \dots, m$) be solutions for
the system that arises out of (W) if the $r_i$ are replaced by $r_i + \Delta r_i$. Then the following
condition holds:

$$\Delta q_1 \Delta r_1 + \Delta q_2 \Delta r_2 + \cdots + \Delta q_m \Delta r_m \le 0.$$

Lemma 2 Let $\{r_1^k, r_2^k, \dots, r_m^k\}$ ($k = 1, 2, \dots$) be a convergent sequence of
$m$-tuples of positive values that converges toward the $m$-tuple $r_1, r_2, \dots, r_m$, and let
$s_j^k$, $p_j^k$ ($j = 1, 2, \dots, n$), $q_i^k$ ($i = 1, 2, \dots, m$) be solutions for the system $(W^k)$
which is obtained by substituting $r_i^k$ for $r_i$ in (W). Then the sequence $\{s_j^k\}$ is bounded
and for every $i$ the sequence $\{q_i^k\}$ is also bounded.

Lemma 3 If solutions of the system $(W^k)$ converge as $k \to +\infty$ to $s_j^*$, $p_j^*$, $q_i^*$, then
these numbers constitute solutions to the system (W).

We present a brief sketch of his proof.

Proof. For $n = 1$, the system obviously has a solution. Indeed, we define

$$s_1^* = \max\{\alpha > 0 \mid a_{i1}\alpha \le r_i,\ i = 1, 2, \dots, m\}.$$

Then we obtain $p_1^* = f_1(s_1^*) \ge 0$. We can choose $q_i^*$ ($i = 1, 2, \dots, m$) from the
non-empty convex set

$$\{\, (q_1, \dots, q_m) \ge 0 \mid p_1^* = a_{11}q_1 + \cdots + a_{m1}q_m \text{ and } q_i = 0 \text{ if } r_i - a_{i1}s_1^* > 0 \,\}.$$
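
The base case can be made concrete. The following Python sketch is only an illustration under stated assumptions: the function name `solve_wald_single_good`, its argument layout, and the rule of loading the whole price onto the first binding factor are choices made here, not part of Wald's proof, which only needs some element of the convex set above.

```python
def solve_wald_single_good(a, r, f):
    """Base case n = 1 of the system (W): one good, m factors.

    a[i] : amount of factor i needed per unit of the good (the column of A)
    r[i] : exogenous supply of factor i, assumed positive as in (A2)
    f    : inverse demand function p = f(s) of the good, as in (A3)
    Returns (s, p, q). The names are hypothetical, chosen for this sketch.
    """
    # (A1): at least one factor is actually used by the good.
    ratios = [r_i / a_i for a_i, r_i in zip(a, r) if a_i > 0]
    if not ratios:
        raise ValueError("assumption (A1) fails: the good uses no factor")

    # Largest output compatible with every constraint a_i * s <= r_i.
    s = min(ratios)
    p = f(s)

    # One admissible choice of q: the first binding factor carries the whole
    # price, every other factor (slack or not) gets price zero.
    q = [0.0] * len(a)
    for i, (a_i, r_i) in enumerate(zip(a, r)):
        if a_i > 0 and r_i / a_i == s:
            q[i] = p / a_i
            break
    return s, p, q


# Two factors, inverse demand f(s) = 1 / s.
s, p, q = solve_wald_single_good([2.0, 1.0], [4.0, 3.0], lambda s: 1.0 / s)
print(s, p, q)
```

In this example the first factor is binding and the second is slack, so the second factor price is zero, in line with the complementary slackness condition.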

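We prove Wald’s theorem for the case $n$ on the inductive hypothesis that it holds for
$n - 1$.

We define $\bar{\lambda} = \max\{\lambda > 0 \mid r_i - a_{in}\lambda \ge 0,\ i = 1, 2, \dots, m\}$. For each
$\lambda \in (0, \bar{\lambda})$, consider the system, which we denote by $(W^{\lambda}_{n-1})$,

$$
(W^{\lambda}_{n-1}) \quad
\begin{cases}
a_{i1}s_1 + \cdots + a_{i,n-1}s_{n-1} \le r_i - a_{in}\lambda = r_i' & (i = 1, 2, \dots, m), \\
p_j = a_{1j}q_1 + \cdots + a_{mj}q_m & (j = 1, 2, \dots, n-1), \\
q_i\,(a_{i1}s_1 + \cdots + a_{i,n-1}s_{n-1} - r_i') = 0.
\end{cases}
$$

For $\lambda \in (0, \bar{\lambda})$, the system $(W^{\lambda}_{n-1})$ satisfies assumptions (A1) to (A3). Thus the
system $(W^{\lambda}_{n-1})$ has solutions $s_j^* > 0$, $p_j^* \ge 0$ ($j = 1, 2, \dots, n-1$) and $q_i^* \ge 0$
($i = 1, 2, \dots, m$) by the inductive hypothesis.

Now we define the set for all $\lambda \in (0, \bar{\lambda})$,

$$\Pi(\lambda) = \{\, a_{1n}q_1 + \cdots + a_{mn}q_m \mid (q_1, \dots, q_m) \text{ are solutions of } (W^{\lambda}_{n-1}) \,\}.$$

From lemma 1, it is verified that if $0 < \lambda_1 < \lambda_2 < \bar{\lambda}$, then no number in $\Pi(\lambda_2)$
is less than any number in $\Pi(\lambda_1)$. Indeed, let $\lambda_1$ and $\lambda_2$ be two numbers such that
$0 < \lambda_1 < \lambda_2 < \bar{\lambda}$. If $(q_1, q_2, \dots, q_m)$ is any solution set of $(W^{\lambda_2}_{n-1})$ and
$(q_1 + \Delta q_1, q_2 + \Delta q_2, \dots, q_m + \Delta q_m)$ is any solution set of $(W^{\lambda_1}_{n-1})$, then lemma 1 shows
that

$$a_{1n}(\lambda_2 - \lambda_1)\Delta q_1 + \cdots + a_{mn}(\lambda_2 - \lambda_1)\Delta q_m \le 0.$$

Then we obtain

$$a_{1n}\Delta q_1 + a_{2n}\Delta q_2 + \cdots + a_{mn}\Delta q_m \le 0.$$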

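Wald ([53], p.15) shows that the set $\Pi(\lambda)$ is bounded, closed and convex. So the set
$\Pi(\lambda)$ is a closed interval, which we denote by $[\underline{\pi}(\lambda), \overline{\pi}(\lambda)]$. From this, it follows that the
sets $\Pi(\lambda_1)$ and $\Pi(\lambda_2)$ are either disjoint or have only one point in common.

Now let $\{\lambda_k\}$ ($k = 1, 2, \dots$) be a sequence such that for every $k$, $0 < \lambda < \lambda_k < \bar{\lambda}$
and $\lim_{k \to +\infty} \lambda_k = \lambda$. The sequence $\{\underline{\pi}(\lambda_k)\}$ is bounded. Let $\beta$ be a limit of
this sequence. Then $\beta \ge \overline{\pi}(\lambda)$. From lemma 2, the sequence of solution sets for the
system $(W^{\lambda_k}_{n-1})$ ($k = 1, 2, \dots$) is bounded. From lemma 3, $\beta$ belongs to the set $\Pi(\lambda)$.
Since $\lambda_k > \lambda$, it must be identical with $\overline{\pi}(\lambda)$. Hence we have
$\lim_{k \to +\infty} \underline{\pi}(\lambda_k) = \overline{\pi}(\lambda)$.

Similarly, from $0 < \lambda_k < \lambda < \bar{\lambda}$ and $\lim_{k \to +\infty} \lambda_k = \lambda$, it follows that
$\lim_{k \to +\infty} \overline{\pi}(\lambda_k) = \underline{\pi}(\lambda)$. Furthermore, it follows that when $\lambda \to \bar{\lambda}$ the sequence
$\{\underline{\pi}(\lambda)\}$ converges.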

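For every $\lambda \in (0, \bar{\lambda})$, we set

$$\underline{\tau}(\lambda) = \underline{\pi}(\lambda) - f_n(\lambda), \qquad \overline{\tau}(\lambda) = \overline{\pi}(\lambda) - f_n(\lambda),$$

and define the closed interval $T(\lambda) = [\underline{\tau}(\lambda), \overline{\tau}(\lambda)]$. Let $0 < \lambda_1 < \lambda_2 < \bar{\lambda}$. Then it follows
that $\overline{\tau}(\lambda_1) < \underline{\tau}(\lambda_2)$ by assumption (A3). It therefore follows that $T(\lambda_1)$ and $T(\lambda_2)$ are
disjoint for $\lambda_1 \neq \lambda_2$.

We distinguish two cases: one is the case in which $\lim_{\lambda \to \bar{\lambda}} \underline{\pi}(\lambda) > f_n(\bar{\lambda})$ holds, the
other is the case in which $\lim_{\lambda \to \bar{\lambda}} \underline{\pi}(\lambda) \le f_n(\bar{\lambda})$ holds. We focus on the former case. We
omit the latter case because the proof is straightforward and it does not matter for
the purpose of our arguments.

In the former case, it is obvious that $\lim_{\lambda \to \bar{\lambda}} \overline{\tau}(\lambda) > 0$ and $\lim_{\lambda \to 0} \overline{\tau}(\lambda) = -\infty$. Then it
follows that there exists a $\lambda_0 \in (0, \bar{\lambda})$ such that $0 \in T(\lambda_0)$.

Then $s_n = \lambda_0$ together with the solution of $(W^{\lambda_0}_{n-1})$ constitute the solution of (W).
This completes the proof of Wald’s theorem.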

As we have already said in section 5, the essential argument of Wald’s proof is to
show that there exists a $\lambda_0 \in (0, \bar{\lambda})$ such that $0 \in T(\lambda_0)$. As Hildenbrand [20] says,
Wald applies the intermediate value theorem for a correspondence in order to obtain
this result. He presents the theorem without proof. We prove here the intermediate value
theorem for a correspondence.

Intermediate value theorem Let $T$ be a convex-valued and bounded-valued
correspondence from a closed interval $I$ into $\mathbb{R}$ that has a closed graph. If there are
$\underline{x}$ and $\overline{x}$ in $I$ such that $T(\underline{x})$ contains a non-positive number and $T(\overline{x})$ contains a
nonnegative number, then there exists $x^* \in I$ such that $0 \in T(x^*)$.

Proof. Let $I = [\underline{x}, \overline{x}]$. It suffices to prove the theorem for the case where
$0 \notin T(\underline{x})$ and $0 \notin T(\overline{x})$. Without loss of generality, we may assume that $T(\underline{x})$
contains only negative numbers and that $T(\overline{x})$ contains only positive numbers.

Now we suppose that there exists no $x^* \in I$ such that $0 \in T(x^*)$.

By assumption, $T$ is a convex-valued correspondence. Then the interval $I$ can be
decomposed into the union of two nonintersecting nonempty subsets $I^-$ and $I^+$ of
$I$. Here $I^-$ consists of all points $x$ such that $T(x)$ contains only negative
numbers. Similarly, $I^+$ consists of all points $x$ such that $T(x)$ contains only
positive numbers. It is clear that $T$ is upper hemi-continuous since $T$ is a
bounded-valued correspondence and has a closed graph. By the upper hemi-continuity of $T$, both
$I^-$ and $I^+$ are relatively open in $I$. Moreover, $I^+$ and $I^-$ are disjoint, i.e. $I^+ \cap I^- = \emptyset$,
and $I^+ \cup I^- = I$. This contradicts the connectedness of $I$.
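
The proof above is by contradiction and does not locate $x^*$. As a complement, the following Python sketch locates such a point by bisection. It is not the argument of the text: it assumes that each value $T(x)$ is a closed interval given by two hypothetical functions `lower(x)` and `upper(x)`, that these values are bounded uniformly on $I$, and that the graph is closed. The function name and the tolerance parameter are likewise assumptions of this sketch.

```python
def find_zero_of_correspondence(lower, upper, x_lo, x_hi, tol=1e-9):
    """Bisection for an interval-valued map T(x) = [lower(x), upper(x)].

    Requires lower(x_lo) <= 0 (T(x_lo) contains a non-positive number)
    and upper(x_hi) >= 0 (T(x_hi) contains a nonnegative number).
    Returns a point x with 0 in T(x), or the midpoint of a bracket of
    width at most tol that still satisfies the sign condition.
    """
    if lower(x_lo) > 0 or upper(x_hi) < 0:
        raise ValueError("sign condition of the theorem is not satisfied")
    a, b = x_lo, x_hi
    # Invariant: lower(a) <= 0 and upper(b) >= 0.
    while abs(b - a) > tol:
        mid = (a + b) / 2
        if lower(mid) <= 0 <= upper(mid):
            return mid          # 0 lies in T(mid)
        if upper(mid) < 0:
            a = mid             # T(mid) is entirely negative
        else:
            b = mid             # T(mid) is entirely positive
    return (a + b) / 2


# T(x) = {-1} for x < 1/3, [-1, 1] at x = 1/3, {1} for x > 1/3.
jump = 1 / 3
x_star = find_zero_of_correspondence(
    lambda x: -1.0 if x <= jump else 1.0,   # lower end of T(x)
    lambda x: -1.0 if x < jump else 1.0,    # upper end of T(x)
    0.0, 1.0)
print(x_star)
```

The example correspondence is single-valued except at the jump, where its value is the whole interval $[-1, 1]$; this is exactly the situation in which a closed graph and convex values make an intermediate value argument possible even though no continuous selection passes through zero.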

(II) The Minkowski-Farkas lemma 

Based upon Wald’s idea of proof, we can prove the Minkowski-Farkas lemma with
the additional conditions (A1) $A \ge 0$, and for each $j$ there is at least one $i$ for which
$a_{ij} \neq 0$, and (A2) $r > 0$.

The Minkowski-Farkas lemma asserts that exactly one of the following alternatives
holds: either the equation

$$As = r \tag{11}$$

has a nonnegative solution, or the inequalities

$${}^{t}Aq \ge 0, \quad {}^{t}qr < 0 \tag{12}$$

have a solution.

Let us consider the column vectors $a_j$ of $A$. The set of all nonnegative linear
combinations of the $a_j$ forms a convex cone $Z\{a_1, a_2, \dots, a_n\}$ spanned by the $a_j$. The
statement that the equality (11) has no nonnegative solution means that the vector $r$ does
not lie in $Z\{a_1, a_2, \dots, a_n\}$. In this case, the lemma asserts that there exists a vector
$q$ which makes an obtuse angle with $r$ and a nonobtuse angle with each of the vectors
$a_j$. This implies that there exists a hyperplane that has the cone $Z\{a_1, a_2, \dots, a_n\}$ on
one side and the point $r$ on the other. For this reason, the Minkowski-Farkas lemma is
referred to as the theorem of the separating hyperplane.
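
The two alternatives can be checked mechanically for a given candidate vector. The following Python sketch only verifies a proposed $s$ or $q$; it does not find one, and it is not the inductive construction of the proof. The helper names `dot`, `column`, `solves_primal` and `is_separating_vector`, and the numerical tolerance, are assumptions of this sketch.

```python
def dot(u, v):
    """Inner product of two vectors given as lists."""
    return sum(x * y for x, y in zip(u, v))


def column(A, j):
    """Column a_j of the matrix A (a list of rows)."""
    return [row[j] for row in A]


def solves_primal(A, r, s, tol=1e-12):
    """True if s >= 0 and A s = r, i.e. (11) has the nonnegative solution s."""
    if any(s_j < -tol for s_j in s):
        return False
    return all(abs(dot(row, s) - r_i) <= tol for row, r_i in zip(A, r))


def is_separating_vector(A, r, q, tol=1e-12):
    """True if tA q >= 0 and tq r < 0, i.e. q solves the inequalities (12)."""
    n = len(A[0])
    nonobtuse = all(dot(column(A, j), q) >= -tol for j in range(n))
    return nonobtuse and dot(q, r) < -tol


# r = (2, 3) lies in the cone spanned by the unit vectors: (11) is solvable.
print(solves_primal([[1.0, 0.0], [0.0, 1.0]], [2.0, 3.0], [2.0, 3.0]))

# r = (1, 2) is not a multiple of a_1 = (1, 1): q = (3, -2) separates them.
print(is_separating_vector([[1.0], [1.0]], [1.0, 2.0], [3.0, -2.0]))
```

That both alternatives cannot hold together follows in one line: if $As = r$ with $s \ge 0$ and ${}^{t}Aq \ge 0$, then ${}^{t}qr = {}^{t}qAs \ge 0$.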

The convex set in question is built up as the convex hull of a finite number of
halflines. Therefore we can construct such a hyperplane in finitely many steps. This fact
enables us to prove the theorem by mathematical induction.

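Proof. Assuming now that (11) has no nonnegative solution, we shall show that (12)
has a solution. We proceed by induction on $n$, the number of columns of $A$.

If $n = 1$, then (11) becomes

$$
\begin{cases}
a_{11}s_1 = r_1 \\
a_{21}s_1 = r_2 \\
\quad \vdots \\
a_{m1}s_1 = r_m.
\end{cases}
$$

Since $a_1$ and $r$ are linearly independent, we can choose $q \neq 0$ so as to satisfy (12).
Now, assume the theorem holds for $n = N - 1$ and let us prove it for $n = N$. Since $r$
does not lie in $Z\{a_1, \dots, a_N\}$, it does not lie in $Z\{a_1, \dots, a_{N-1}\}$ either, so the
inductive hypothesis yields a vector $q$ with $(a_j, q) \ge 0$ ($1 \le j \le N - 1$) and $(q, r) < 0$.

If the solution $q$ satisfies

$$a_{1N}q_1 + a_{2N}q_2 + \cdots + a_{mN}q_m \ge 0,$$

then the assertion is proved.

Otherwise, suppose

$$a_{1N}q_1 + a_{2N}q_2 + \cdots + a_{mN}q_m < 0.$$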



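Let

$$\tilde{a}_j = (q, a_N)a_j - (a_j, q)a_N \quad (1 \le j \le N - 1), \qquad \tilde{r} = (q, a_N)r - (q, r)a_N.$$

Then $\tilde{r} \notin Z\{\tilde{a}_1, \tilde{a}_2, \dots, \tilde{a}_{N-1}\}$.

For if not, there exist nonnegative numbers $\lambda_j$ such that $\tilde{r} = \sum_{j=1}^{N-1} \lambda_j \tilde{a}_j$.
Then

$$
(q, a_N)r - (q, r)a_N = \sum_{j=1}^{N-1} \lambda_j \bigl[ (q, a_N)a_j - (a_j, q)a_N \bigr]
= \sum_{j=1}^{N-1} \lambda_j (q, a_N)a_j - \sum_{j=1}^{N-1} \lambda_j (a_j, q)a_N.
$$

Hence we have

$$
r = \sum_{j=1}^{N-1} \lambda_j a_j + \frac{(q, r) - \sum_{j=1}^{N-1} \lambda_j (a_j, q)}{(q, a_N)}\, a_N,
$$

where the coefficient of $a_N$ is positive because $(q, r) < 0$, $(a_j, q) \ge 0$ and $(q, a_N) < 0$.
This means $r \in Z\{a_1, a_2, \dots, a_{N-1}, a_N\}$. This is a contradiction.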

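By the inductive hypothesis there exists $\tilde{q}$ such that $(\tilde{a}_j, \tilde{q}) \ge 0$ ($1 \le j \le N - 1$)
and $(\tilde{q}, \tilde{r}) < 0$.

Then let $q^* = (q, a_N)\tilde{q} - (\tilde{q}, a_N)q$ and note

$$
\begin{aligned}
(q^*, a_j) &= (q, a_N)(\tilde{q}, a_j) - (\tilde{q}, a_N)(q, a_j) = (\tilde{q}, \tilde{a}_j) \ge 0 \quad (1 \le j \le N - 1), \\
(q^*, a_N) &= (q, a_N)(\tilde{q}, a_N) - (\tilde{q}, a_N)(q, a_N) = 0, \\
(q^*, r) &= (q, a_N)(\tilde{q}, r) - (\tilde{q}, a_N)(q, r) = (\tilde{q}, \tilde{r}) < 0.
\end{aligned}
$$

Thus, $q^*$ satisfies (12) and the theorem is proved.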



(III) The duality theorem in linear programming

Moreover, as is well known, the standard method to prove the duality theorem in
linear programming is to appeal to the Minkowski-Farkas lemma. We replace the
equality of the price-cost conditions for goods ${}^{t}Aq = p$ by the inequality ${}^{t}Aq \ge p$, and $p$
is assumed to be given. We introduce the complementary slackness conditions (16)
${}^{t}s\,({}^{t}Aq - p) = 0$. It can be verified that Wald’s idea is applicable to the proof of the
duality theorem in linear programming.

We consider the following system.

$$As \le r, \tag{13}$$

$${}^{t}q\,(As - r) = 0, \tag{14}$$

$${}^{t}Aq \ge p, \tag{15}$$

$${}^{t}s\,({}^{t}Aq - p) = 0. \tag{16}$$

We assume (A1) $A \ge 0$, and for each $j$ there is at least one $i$ for which $a_{ij} \neq 0$,
(A2’) $r \ge 0$ and $p \ge 0$. We prove that the above system has nonnegative solutions $s$
and $q$.
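
A pair $(s, q)$ proposed as a solution of (13) to (16) can be checked directly. The following Python sketch is a verifier only, not the inductive construction of the proof; the function name `satisfies_duality_system`, the helper `dot` and the tolerance are assumptions of this sketch. Note that (14) and (16) together give ${}^{t}ps = {}^{t}qAs = {}^{t}rq$, so the two linear programs attain the same value at such a pair.

```python
def dot(u, v):
    """Inner product of two vectors given as lists."""
    return sum(x * y for x, y in zip(u, v))


def satisfies_duality_system(A, r, p, s, q, tol=1e-12):
    """Check (13)-(16) for nonnegative s (length n) and q (length m)."""
    m, n = len(A), len(A[0])
    if min(s) < -tol or min(q) < -tol:
        return False
    # r - As, the unused amount of each factor.
    slack_factor = [r[i] - dot(A[i], s) for i in range(m)]
    # tAq - p, the excess of unit cost over price for each good.
    slack_price = [dot([A[i][j] for i in range(m)], q) - p[j] for j in range(n)]
    feasible = min(slack_factor) >= -tol and min(slack_price) >= -tol  # (13), (15)
    complementary = (abs(dot(q, slack_factor)) <= tol                  # (14)
                     and abs(dot(s, slack_price)) <= tol)              # (16)
    return feasible and complementary


# One factor, two goods: s_1 + s_2 <= 1, prices p = (1, 0.5).
A, r, p = [[1.0, 1.0]], [1.0], [1.0, 0.5]
print(satisfies_duality_system(A, r, p, [1.0, 0.0], [1.0]))   # solves (13)-(16)
print(satisfies_duality_system(A, r, p, [0.0, 1.0], [1.0]))   # violates (16)
print(dot(p, [1.0, 0.0]), dot(r, [1.0]))                      # equal values
```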



Proof. We proceed by induction on $n$. If $n = 1$, then (13) becomes

\[
\begin{cases}
a_{11}s_1 \leq r_1 \\
a_{21}s_1 \leq r_2 \\
\qquad\vdots \\
a_{m1}s_1 \leq r_m.
\end{cases}
\]

By assumption (A1), for each $j$ there is at least one $i$ such that $a_{ij} > 0$. For each
such $i$ we form $r_i / a_{i1}$ and denote the minimum of these values by $\alpha$. Then $\alpha$ is a
solution of (13).

For the case $s_1 = \alpha = 0$, it is obvious that we can choose $q$ in a manner satisfying
conditions (14), (15) and (16).

We consider the case $s_1 = \alpha > 0$. For $s_1 = \alpha > 0$, if $a_{i1}s_1 \leq r_i$ $(i = 1, 2, \dots, m)$
in (13) holds with strict inequality for some $i$, then the corresponding $q_i$ is zero.
Then, without loss of generality, we may assume (reordering if necessary)

\[ q = (\underbrace{q_1, \dots, q_k}_{k}, \underbrace{0, \dots, 0}_{m-k}), \qquad q_k \neq 0. \]

In view of (15) and (16), we can choose $q_k$ so as to satisfy the condition
$a_{11}q_1 + a_{21}q_2 + \cdots + a_{k1}q_k = p_1$.
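
The case $n = 1$ can be traced numerically. The sketch below is one possible reading, under assumptions: the function name `solve_one_activity` and the data are hypothetical, inputs are integers, and all the weight of $q$ is placed on a single binding row, which is only one of the admissible choices described above.

```python
from fractions import Fraction


def solve_one_activity(a, r, p1):
    """n = 1 case: a[i] stands for a_{i1} >= 0 (not all zero), r[i] >= 0, p1 >= 0.

    Returns (s_1, q) with s_1 the minimum of r_i / a_{i1} over a_{i1} > 0.
    """
    ratios = {i: Fraction(r[i], a[i]) for i in range(len(a)) if a[i] > 0}
    i_star = min(ratios, key=ratios.get)   # a row where (13) is binding
    alpha = ratios[i_star]
    q = [Fraction(0)] * len(a)             # q_i = 0 on all other rows
    q[i_star] = Fraction(p1, a[i_star])    # makes (15) hold with equality
    return alpha, q


a, r, p1 = [2, 4], [4, 4], 3               # hypothetical data
s1, q = solve_one_activity(a, r, p1)

c13 = all(a[i] * s1 <= r[i] for i in range(2))
c14 = sum(q[i] * (a[i] * s1 - r[i]) for i in range(2)) == 0
c15 = sum(a[i] * q[i] for i in range(2)) >= p1
c16 = s1 * (sum(a[i] * q[i] for i in range(2)) - p1) == 0
```

Here the first row is slack, so its $q_i$ is zero, and the second row is binding and carries the whole price $p_1$.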

Now we define

\[ \bar\lambda = \max\{\lambda \geq 0 \mid r_i - a_{iN}\lambda \geq 0,\ i = 1, 2, \dots, m\} \]

and $r_i' = r_i - a_{iN}\lambda$. From $\lambda \in [0, \bar\lambda]$, it follows that $r_i' \geq 0$.

Consider the system for $n = N-1$.



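\[
\begin{cases}
a_{11}s_1 + a_{12}s_2 + \cdots + a_{1,N-1}s_{N-1} \leq r_1 - a_{1N}\lambda = r_1' \\
a_{21}s_1 + a_{22}s_2 + \cdots + a_{2,N-1}s_{N-1} \leq r_2 - a_{2N}\lambda = r_2' \\
\qquad\vdots \\
a_{m1}s_1 + a_{m2}s_2 + \cdots + a_{m,N-1}s_{N-1} \leq r_m - a_{mN}\lambda = r_m' \\
\lambda \in [0, \bar\lambda],
\end{cases}
\tag{17}
\]

\[
q_1\Bigl(\sum_{j=1}^{N-1} a_{1j}s_j - r_1'\Bigr) + \cdots + q_m\Bigl(\sum_{j=1}^{N-1} a_{mj}s_j - r_m'\Bigr) = 0,
\tag{18}
\]

\[
\begin{cases}
a_{11}q_1 + a_{21}q_2 + \cdots + a_{m1}q_m \geq p_1 \\
a_{12}q_1 + a_{22}q_2 + \cdots + a_{m2}q_m \geq p_2 \\
\qquad\vdots \\
a_{1,N-1}q_1 + a_{2,N-1}q_2 + \cdots + a_{m,N-1}q_m \geq p_{N-1},
\end{cases}
\tag{19}
\]

\[
s_1\Bigl(\sum_{i=1}^{m} a_{i1}q_i - p_1\Bigr) + \cdots + s_{N-1}\Bigl(\sum_{i=1}^{m} a_{i,N-1}q_i - p_{N-1}\Bigr) = 0.
\tag{20}
\]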

We assume that the system for $n = N-1$ has nonnegative solutions $s_1, \dots, s_{N-1}$
and $q_1, \dots, q_m$. And we now prove the theorem for $n = N$.

We distinguish two cases.

[Case 1] There exists $q$ in the solution set for $n = N-1$ such that
$a_{1N}q_1 + \cdots + a_{mN}q_m \geq p_N$. In this case, we can choose $s_N = \lambda \in [0, \bar\lambda]$ so as to suit our purposes.
Then $q$ together with $s_1, \dots, s_{N-1}, s_N = \lambda$ constitute the solution of the system.
[Case 2] For any $q$ in the solution set, $a_{1N}q_1 + \cdots + a_{mN}q_m < p_N$ holds. By assumption
(A1), there exists at least one $i$ which attains $\bar\lambda$. We choose one such $i$ and denote it
by $i^*$.
Let $f(t) = a_{1N}q_1 + \cdots + a_{i^*N}(q_{i^*} + t) + \cdots + a_{mN}q_m$ $(t \geq 0)$. Then $f(0) < p_N$,
$f(+\infty) = +\infty$ and $f$ is continuous. By the intermediate value theorem, there
exists $t^*$ such that $f(t^*) = p_N$. Consider $q' = (q_1, \dots, q_{i^*} + t^*, \dots, q_m)$. For such a
vector $q'$, the condition (18) holds, since $r_{i^*}' = 0$ and $a_{i^*1}s_1 + \cdots + a_{i^*,N-1}s_{N-1} = 0$.

It is obvious that $q'$ satisfies the condition (19). The equation (20) also holds. Here we
note that when $a_{i^*j} = 0$, $s_j \geq 0$, and when $a_{i^*j} > 0$, $s_j = 0$. The left-hand side of (20) is

\[
\begin{aligned}
& s_1\bigl[a_{11}q_1 + \cdots + a_{i^*1}(q_{i^*} + t^*) + \cdots + a_{m1}q_m - p_1\bigr] \\
& \quad + \cdots + s_{N-1}\bigl[a_{1,N-1}q_1 + \cdots + a_{i^*,N-1}(q_{i^*} + t^*) + \cdots + a_{m,N-1}q_m - p_{N-1}\bigr] \\
& = 0.
\end{aligned}
\]

Then $q'$ together with $s_1, \dots, s_{N-1}, s_N = \bar\lambda$ constitute the solution of the system. We
complete the proof.
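
The value $t^*$ obtained from the intermediate value theorem in Case 2 can also be written down explicitly, since $f$ is affine in $t$. The following worked equation is a sketch under the assumption that $a_{i^*N} > 0$, which is what the row attaining $\bar\lambda$ is expected to satisfy when $\bar\lambda$ is finite:

```latex
f(t) = f(0) + a_{i^*N}\, t
\quad\Longrightarrow\quad
t^* = \frac{p_N - f(0)}{a_{i^*N}} > 0,
\qquad\text{so that}\qquad
f(t^*) = p_N .
```

Positivity of $t^*$ follows from $f(0) < p_N$, which is the defining inequality of Case 2.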



(IV) The minimax theorem 

In its mathematical structure, the minimax theorem has turned out to be similar to 
the duality theorem in linear programming. 

We consider the two-person zero-sum matrix game and the minimax theorem. 
Wald's idea is applicable to the proof of the minimax theorem. We interpret $A \geq 0$
as an $(m \times n)$ payoff matrix and assume (A1) $A \geq 0$, and for each $j$ there is at least one
$i$ for which $a_{ij} > 0$. In order to prove the minimax theorem, it suffices to show that the
expected payoff has a saddle point.

Let $1_m = {}^t(1, 1, \dots, 1) \in \mathbb{R}^m$, $1_n = {}^t(1, 1, \dots, 1) \in \mathbb{R}^n$ and let us consider the
following system.

\[
\begin{aligned}
& As \leq 1_m, \\
& {}^tq(As - 1_m) = 0, \\
& {}^tAq \geq 1_n, \\
& {}^ts({}^tAq - 1_n) = 0.
\end{aligned}
\]



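We can prove that this system has nonnegative solutions $s^0$ and $q^0$ in exactly the
same way as the above theorem. For such $s^0$ and $q^0$, ${}^ts^0 1_n = {}^tq^0 1_m$ holds, since by the
two complementary slackness conditions both sides are equal to ${}^tq^0 A s^0$. We set

\[ v = {}^ts^0 1_n = {}^tq^0 1_m. \]

We define

\[
s^* = \frac{s^0}{v} \in S_1 = \Bigl\{(s_1, s_2, \dots, s_n) \in \mathbb{R}^n \Bigm| \sum_{j=1}^{n} s_j = 1,\ s_j \geq 0\Bigr\},
\]

\[
q^* = \frac{q^0}{v} \in S_2 = \Bigl\{(q_1, q_2, \dots, q_m) \in \mathbb{R}^m \Bigm| \sum_{i=1}^{m} q_i = 1,\ q_i \geq 0\Bigr\}.
\]

Thus,

\[ As^* \leq \frac{1}{v}\, 1_m, \qquad {}^tAq^* \geq \frac{1}{v}\, 1_n. \]

For any $s \in S_1$ and $q \in S_2$,

\[
{}^tqAs^* \leq \frac{1}{v}\, {}^tq\, 1_m = \frac{1}{v},
\qquad
{}^ts\, {}^tAq^* \geq \frac{1}{v}\, {}^ts\, 1_n = \frac{1}{v}.
\]

Thus, since ${}^tq^*As^* = \dfrac{1}{v}$, we obtain

\[ {}^tqAs^* \leq {}^tq^*As^* \leq {}^tq^*As. \]

This shows that $(s^*, q^*)$ is a saddle point.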
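
The normalization argument can be checked on a small game. The sketch below uses a hypothetical $2 \times 2$ payoff matrix and a hand-computed nonnegative solution of the system before normalization (named `s0` and `q0` here for illustration); none of these numbers come from the text. Because the expected payoff is linear in each player's strategy, testing the saddle inequalities against pure strategies is enough.

```python
from fractions import Fraction

A = [[2, 1], [1, 2]]                    # hypothetical payoff matrix, A >= 0
s0 = [Fraction(1, 3), Fraction(1, 3)]   # satisfies A s0 = 1_m
q0 = [Fraction(1, 3), Fraction(1, 3)]   # satisfies tA q0 = 1_n

v = sum(s0)                             # v = ts0 1_n, equal to tq0 1_m
s_star = [x / v for x in s0]            # normalized strategy in S_1
q_star = [x / v for x in q0]            # normalized strategy in S_2


def payoff(q, s):
    """Expected payoff tq A s."""
    return sum(q[i] * A[i][j] * s[j] for i in range(2) for j in range(2))


value = payoff(q_star, s_star)
pure = [[1, 0], [0, 1]]
# Saddle inequalities tq A s* <= tq* A s* <= tq* A s on pure strategies.
saddle = all(payoff(e, s_star) <= value <= payoff(q_star, e) for e in pure)
```

In this example the value of the game equals $1/v$, in line with ${}^tq^*As^* = 1/v$ in the argument above.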